Introduction



Analyzing whether a time series is stationary or is a non-stationary random walk (unit root
process), in the sense that the first-order differences form a stationary series, is an important
issue in time series analysis, particularly in econometrics. Often the task is to test the unit
root null hypothesis against the alternative of stationarity at a pre-specified $\alpha$ level, which
ensures that a decision in favor of stationarity is statistically significant. For instance,
the equilibrium analysis of macroeconomic variables as established by Granger (1981) and
Engle and Granger (1987) defines an equilibrium of two random walks as the existence
of a stationary linear combination. When analyzing equilibrium errors of a cointegration
relationship, rejection of the null hypothesis in favor of stationarity means that the decision
to believe in a valid equilibrium is statistically justified at the pre-specified $\alpha$ level. For
an approach where CUSUM-based residual tests are employed to test the null hypothesis
of cointegration, we refer to Xiao and Phillips (2002). Their test uses residuals calculated
from the full sample. In the present article we study sequential monitoring procedures
which aim at monitoring a time series until a time horizon $T$ to detect stationarity as soon
as possible.

The question whether a time series is stationary or a random walk is also of considerable
importance for choosing a valid method when analyzing the series to detect trends. Such
procedures usually assume stationarity, see Steland (2004, 2005a), Pawlak et al. (2004),
Huskova (1999), Huskova and Slaby (2001), Ferger (1993, 1995), among others. As shown in
Steland (2005b), when using Nadaraya-Watson type smoothers to detect drifts the limiting
distributions for the random walk case differ substantially from the case of a stationary
time series.

To detect changes in a process or a misspecified model, a common approach originating
in statistical quality control is to formulate an in-control model (null hypothesis) and an
out-of-control model (alternative), and to apply appropriate control charts or stopping
times. Given a time series $Y_1, Y_2, \dots$, a monitoring procedure with time horizon (maximum
sample size) $T$ is given by a stopping time $S_T = \inf\{1 \le t \le T : U_t \in A\}$, using the
convention $\inf \emptyset = \infty$, where $U_t$, called the control statistic, is a $\sigma(Y_1, \dots, Y_t)$-measurable
$\mathbb{R}$-valued statistic sensitive to the alternatives of interest, and $A \subset \mathbb{R}$ is a measurable set
such that $\{U_t \in A\}$ has small probability under the null model and high probability under
the alternative of interest. In most cases $A$ is of the form $(-\infty, c)$ or $(c, \infty)$ for some given
control limit (critical value) $c$. To design monitoring procedures, the standard approach is
to choose the control limit to ensure that the average run length (ARL), $\mathrm{ARL} = E(S_T)$,
is greater than or equal to some pre-specified value. However, controlling the significance level
is also a serious concern. The results presented in this article can be used to control any
characteristic of interest, although we will focus on the type I error in the sequel.
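
The generic monitoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `stopping_time`, the example values, and the control limit `-2.0` are hypothetical and not taken from the article; the set $A$ is passed as a predicate.

```python
import math
from typing import Callable, Sequence


def stopping_time(u: Sequence[float], in_signal_set: Callable[[float], bool]) -> float:
    """Return the first (1-based) time t with U_t in A, or inf if there is no signal.

    `u` holds the control statistics U_1, ..., U_T, so T = len(u) is the time horizon.
    """
    for t, u_t in enumerate(u, start=1):
        if in_signal_set(u_t):
            return t
    return math.inf  # convention: inf of the empty set is infinity


# Hypothetical example: A = (-inf, c) with an assumed control limit c = -2.0.
control_limit = -2.0
u_values = [0.3, -1.1, -2.5, -0.4, -3.0]
s_T = stopping_time(u_values, lambda x: x < control_limit)
```

With a lower control limit, as used later for the Dickey-Fuller charts, a signal is given at the first time the statistic drops below $c$; a return value of `math.inf` corresponds to accepting the in-control model over the whole horizon.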

The (weighted) Dickey-Fuller control chart studied in this article is essentially based on a
sequential version of the well-known Dickey-Fuller (DF) unit root test, which is motivated
by least squares. Due to its power properties this test is very popular, although it is known
that its statistical properties strongly depend on a correct specification of the correlation
structure of the innovation sequence. The DF test and its asymptotic properties, particularly
its non-standard limit distribution, have been studied by White (1958), Fuller (1976),
Rao (1978, 1980), Dickey and Fuller (1979), Evans and Savin (1981), Chan and Wei
(1987, 1988), and Phillips (1987), among others. We will generalize some of these results. To
ensure quicker detection in case of a change to stationarity, we modify the DF statistic
by introducing kernel weights to attach small weights to summands corresponding to past
observations. We provide the asymptotic theory for the related Dickey-Fuller (DF) type
processes and stopping times, also covering local-to-unity alternatives.

For correlated error terms the asymptotic distribution of the DF test statistic, and hence
the control limit of a monitoring procedure, depends on a nuisance parameter, which can
be estimated by Newey-West type estimators. We consider two approaches to deal with
that problem. Firstly, based on a consistent estimate of the nuisance parameter one may
take the asymptotic control limit corresponding to the estimated value. Secondly, following
Phillips (1987) one may consider appropriate transformations of the processes possessing
limit distributions which no longer depend on the nuisance parameter. A nonparametric
approach called the KPSS test, which avoids this problem at least for $I(1)$ processes, has been
proposed by Kwiatkowski et al. (1992). That unit root test has better type I error accuracy,
but tends to be less powerful. Monitoring procedures related to this approach and their
merits have been studied in detail in Steland (2006).

The organisation of the paper is as follows. In Section 1 we carefully explain and motivate
our assumptions on the time series model, and present the class of Dickey-Fuller type
processes and related stopping times. The asymptotic distribution theory under the null
hypothesis of a random walk is provided in Section 2. Section 3 studies local-to-unity
asymptotics, where the asymptotic distribution is driven by an Ornstein-Uhlenbeck process
instead of the Brownian motion appearing in the unit root case. Finally, in Section 4 we
compare the methods by simulations.

1. Model, assumptions, and Dickey-Fuller type processes and control charts

1.1. Time series model. Our results work under quite general nonparametric assumptions
allowing for dependencies and conditional heteroskedasticity (GARCH effects), thus
providing a nonparametric view on the parametrically motivated approach. To motivate
our assumptions, let us consider the following common time series model, which is often
used in applications. Suppose to this end that $\{Y_t\}$ is an AR($p$) time series, i.e.,

$$Y_t = a_1 Y_{t-1} + \dots + a_p Y_{t-p} + u_t,$$

for starting values $Y_{-p}, \dots, Y_{-1}$, where $\{u_t\}$ are i.i.d. error terms (innovations) with $E(u_t) = 0$
and $\sigma_u^2 = \mathrm{Var}(u_t)$, $0 < \sigma_u^2 < \infty$. Assume the characteristic polynomial

$$p(z) = 1 - a_1 z - \dots - a_p z^p, \quad z \in \mathbb{C},$$

has a unit root, i.e., $p(1) = 0$, of multiplicity 1, and all other roots are outside the unit
circle, i.e., $p(z) = 0$ and $z \ne 1$ imply $|z| > 1$. Then $p(z) = p^*(z)(1 - z)$ for some polynomial $p^*(z)$
which has no roots in the closed unit disc, implying that $1/p^*(z)$ exists for all $|z| \le 1$. We obtain
$p(L) Y_t = p^*(L) \Delta Y_t = u_t$, where $L$ denotes the lag operator. Since $p^*(L)$ can be inverted, we
have the representation

$$Y_t = Y_{t-1} + \sum_{j=0}^{\infty} \beta_j u_{t-j}, \tag{1}$$

for coefficients $\{\beta_j\}$. This means that $\{Y_t\}$ satisfies an AR(1) model with correlated errors. For
the calculation of the $\beta_j$ we refer to Brockwell and Davis (1991, Sec. 3.3). In particular, to
analyze an AR($p$) series for a unit root, one can work with an AR(1) model with correlated
errors.
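
As a small worked illustration of this reduction, the following sketch factors out the unit root from an AR characteristic polynomial and computes the first coefficients of the power series of the inverse of the remaining factor. It assumes NumPy; the function name `unit_root_factor` and the AR(2) example with coefficients 1.5 and -0.5 are hypothetical and chosen only so that the polynomial has a unit root.

```python
import numpy as np


def unit_root_factor(a, n_coef=6):
    """Factor p(z) = 1 - a_1 z - ... - a_p z^p as p*(z) (1 - z).

    Returns the coefficients of p*(z) in ascending powers of z and the first
    `n_coef` coefficients beta_j of the power series 1 / p*(z).
    """
    p_asc = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    # np.polydiv works with descending powers; the divisor is 1 - z = -z + 1.
    quot, rem = np.polydiv(p_asc[::-1], np.array([-1.0, 1.0]))
    if not np.allclose(rem, 0.0):
        raise ValueError("p(1) != 0: the polynomial has no unit root")
    p_star = quot[::-1]
    # Invert p*(z) as a power series: sum_i p*_i beta_{j-i} = 1 if j == 0, else 0.
    beta = np.zeros(n_coef)
    for j in range(n_coef):
        acc = 1.0 if j == 0 else 0.0
        for i in range(1, min(j, len(p_star) - 1) + 1):
            acc -= p_star[i] * beta[j - i]
        beta[j] = acc / p_star[0]
    return p_star, beta


# Hypothetical AR(2): Y_t = 1.5 Y_{t-1} - 0.5 Y_{t-2} + u_t, so p(z) = (1 - z)(1 - 0.5 z).
p_star, beta = unit_root_factor([1.5, -0.5])
```

For this example the remaining factor is $1 - 0.5z$, so the differenced series behaves like a stationary AR(1) with coefficient 0.5, and the error coefficients decay geometrically.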

The representation (1) motivates the following time series framework, which will be assumed
in the sequel. Suppose we are given a univariate time series $\{Y_t : t = 0, 1, \dots\}$ satisfying

$$Y_t = \rho Y_{t-1} + \epsilon_t, \quad t \ge 1, \quad Y_0 = 0, \tag{2}$$

where $\rho \in (-1, 1]$ is a fixed but unknown parameter. Concerning the error terms $\{\epsilon_t\}$ we
impose the following assumptions.

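(E1) $\{\epsilon_t\}$ is a strictly stationary series with mean zero and $E|\epsilon_1|^4 < \infty$ with the following
properties: We have

$$\sum_{t=1}^{\infty} \mathrm{Cov}(\epsilon_1^2, \epsilon_{1+t}^2) < \infty,$$

and both $\{\epsilon_t\}$ and $\{\epsilon_t^2\}$ satisfy a functional central limit theorem, i.e.,

$$T^{-1/2} \sum_{i \le \lfloor Ts \rfloor} \epsilon_i \Rightarrow \eta B(s), \tag{3}$$

and

$$T^{-1/2} \sum_{i \le \lfloor Ts \rfloor} (\epsilon_i^2 - E\epsilon_i^2) \Rightarrow \eta' B^{(2)}(s), \tag{4}$$

as $T \to \infty$, for constants $0 < \eta, \eta' < \infty$. Here $B$ and $B^{(2)}$ denote (standard)
Brownian motions (Wiener processes) with start in 0.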
(E2) $\{\epsilon_t\}$ is a strong mixing strictly stationary time series with £'|ei|^(-'^+'') < oo for
some $\delta > 0$, and with mixing coefficients, $\alpha(k)$, satisfying

oo 

< oo. 

fe=l 

In assumption (E1) and the rest of the paper $\Rightarrow$ denotes weak convergence in the space
$D[0, 1]$ of all cadlag functions equipped with the Skorokhod metric $d$.

Remark 1.1. The assumption that $\{\epsilon_t\}$ satisfies an invariance principle can be regarded
as a nonparametric definition of the $I(0)$ property ensuring that the partial sums converge
weakly to a (scaled) Brownian motion $B$. For a parametrically oriented definition see Stock
(1994). Particularly, the scale parameter $\eta$ is given by

$$\eta^2 = \lim_{T \to \infty} \eta_T^2, \qquad \eta_T^2 = \sigma^2 + 2 \sum_{t=1}^{T} (T - t) T^{-1} E(\epsilon_1 \epsilon_{1+t}), \tag{5}$$

where $\sigma^2 = E(\epsilon_1^2)$. Also introduce the notations

$$\vartheta_T = \eta_T / \sigma, \qquad \vartheta = \lim_{T \to \infty} \vartheta_T. \tag{6}$$

If the $\epsilon_t$ are uncorrelated, we have $\eta^2 = \sigma^2$, and $\vartheta = 1$.
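
To make the scale parameter concrete, the following sketch evaluates the finite-$T$ quantity in (5) for a given autocovariance function. The MA(1) error model $\epsilon_t = u_t + \theta u_{t-1}$ with unit innovation variance and $\theta = 0.5$ is an assumption introduced only for illustration; the function names are hypothetical.

```python
def eta_T_squared(acov, T):
    """eta_T^2 = sigma^2 + 2 * sum_{t=1}^{T} (T - t) / T * E(eps_1 eps_{1+t}).

    `acov(k)` must return the lag-k autocovariance of the error terms.
    """
    return acov(0) + 2.0 * sum((T - t) / T * acov(t) for t in range(1, T + 1))


theta = 0.5  # assumed MA(1) coefficient


def ma1_acov(k):
    """Autocovariances of eps_t = u_t + theta * u_{t-1} with Var(u_t) = 1."""
    if k == 0:
        return 1.0 + theta ** 2
    return theta if k == 1 else 0.0


eta2_T = eta_T_squared(ma1_acov, T=1000)      # approaches (1 + theta)^2 = 2.25
vartheta_T = (eta2_T / ma1_acov(0)) ** 0.5    # eta_T / sigma
```

For this error model $\eta^2 = (1+\theta)^2$ and $\sigma^2 = 1 + \theta^2$, so $\vartheta^2 = (1+\theta)^2/(1+\theta^2) = 1.8$, which shows how positive correlation pushes the nuisance parameter above 1.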

As a non-trivial example for processes satisfying (E1) let us consider ARCH processes.

Example 1.1. A time series $\{X_t\}$ satisfies ARCH($\infty$) equations, if there exists a sequence
of i.i.d. non-negative random variables, $\{\xi_t\}$, such that

$$X_t = \rho_t \xi_t, \qquad \rho_t = a + \sum_{j=1}^{\infty} b_j X_{t-j},$$

where $a \ge 0$, $b_j \ge 0$, $j = 1, 2, \dots$ This model is often applied to model conditional
heteroscedasticity of an uncorrelated sequence $\{\epsilon_t\}$ with $E\epsilon_t = 0$ for all $t$, by putting $X_t = \epsilon_t^2$.
A common choice for $\xi_t$ is to assume that the $\xi_t$ are i.i.d. with common standard normal
distribution. In Giraitis et al. (2003) it has been shown that a unique and strictly
stationary solution exists and satisfies $\sum_k \mathrm{Cov}(X_1, X_{1+k}) < \infty$, if

oo 

In addition, under these conditions the functional central limit theorem holds. The rate
of decay of the coefficients $b_j$ controls the asymptotic behavior of $\mathrm{Cov}(X_1, X_{1+k})$. If for
some $\gamma > 1$ and $c > 0$ we have $b_j \le c j^{-\gamma}$, $j = 1, 2, \dots$, then there exists $C > 0$ such that
$\mathrm{Cov}(X_1, X_{1+k}) \le C k^{-\gamma}$ for $k \ge 1$. Thus, depending on the rate of decay, (E2) may also
hold.

Remark 1.2. Assumption (E2) will be used to verify a tightness criterion. Combined with
appropriate moment conditions it implies the invariance principles (3) and (4).

1.2. Dickey-Fuller processes. We will now introduce the class of Dickey-Fuller processes
and related detection procedures. Recall that the least squares estimator of the parameter
$\rho$ in model (2) is given by

$$\hat\rho_T = \sum_{t=1}^{T} Y_t Y_{t-1} \Big/ \sum_{t=1}^{T} Y_{t-1}^2.$$

To test the null hypothesis $H_0 : \rho = 1$, one forms the Dickey-Fuller (DF) test statistic

$$D_T = T(\hat\rho_T - 1) = \frac{T^{-1} \sum_{t=1}^{T} Y_{t-1}(Y_t - Y_{t-1})}{T^{-2} \sum_{t=1}^{T} Y_{t-1}^2}.$$
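
The least squares estimator and the DF statistic just introduced can be computed directly from an observed path. The sketch below assumes NumPy and that the array holds $Y_0, Y_1, \dots, Y_T$; the function name `df_statistic` and the toy path are hypothetical.

```python
import numpy as np


def df_statistic(y):
    """Return (rho_hat, D_T) for a path y = (Y_0, Y_1, ..., Y_T)."""
    y = np.asarray(y, dtype=float)
    y_lag = y[:-1]          # Y_{t-1}, t = 1, ..., T
    dy = np.diff(y)         # Y_t - Y_{t-1}
    T = len(dy)
    denom = np.sum(y_lag ** 2)
    rho_hat = np.sum(y[1:] * y_lag) / denom
    # Equivalent to T * (rho_hat - 1), written as in the ratio form of D_T.
    d_T = (np.sum(y_lag * dy) / T) / (denom / T ** 2)
    return rho_hat, d_T


rho_hat, d_T = df_statistic([0.0, 1.0, 3.0, 2.0])
```

Both expressions for the statistic agree because $\sum Y_{t-1}(Y_t - Y_{t-1}) = \sum Y_t Y_{t-1} - \sum Y_{t-1}^2$.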

Suppose at this point that the $\epsilon_t$ are uncorrelated. Provided $|\rho| < 1$,
$\sqrt{\sum_{t=1}^{T} Y_{t-1}^2}\,(\hat\rho_T - \rho) \xrightarrow{d} N(0, 1)$, as $T \to \infty$. However, $\hat\rho_T$ has a different convergence rate and a non-normal
limit distribution, if $\rho = 1$. It is known that

$$D_T \xrightarrow{d} \mathcal{D}_1 = (1/2)\,(B(1)^2 - 1) \Big/ \int_0^1 B(r)^2 \, dr,$$

as $T \to \infty$, see White (1958), Fuller (1976), Rao (1978, 1980), Dickey and Fuller (1979),
and Evans and Savin (1981). Recall that $B$ denotes standard Brownian motion. Based on
that result one can construct a statistical level $\alpha$ test, which rejects the null hypothesis
$H_0 : \rho = 1$ of a unit root against the alternative $H_1 : \rho < 1$ if $D_T < c$, where the
critical value $c$ is the $\alpha$-quantile of the distribution of $\mathcal{D}_1$. More generally, we want to
construct a detection rule which provides a signal if there is some change-point $q$ such that
$Y_1, \dots, Y_{q-1}$ form a random walk (unit root process), and $Y_q, \dots, Y_T$ form an AR(1) with
dependent innovations. This means that the alternative hypothesis is $H_1 = \bigcup_{1 \le q \le T} H_1^{(q)}$, where
$H_1^{(q)}$, $1 \le q \le T$, specifies that

$$Y_t = \begin{cases} Y_{t-1} + \epsilon_t, & 1 \le t < q, \\ \rho Y_{t-1} + \epsilon_t, & q \le t \le T, \end{cases}$$

where $\rho \in (-1, 1)$. However, for the calculation of the detection rule to be introduced now,
knowledge of a specific alternative hypothesis is not required.

A naive approach to monitor a time series to check for deviations from the unit root
hypothesis is to apply the DF statistic at each time point using the most recent observations.
A more sophisticated version of this idea is to modify the DF statistic to ensure that
summands in the numerator have small weight if their time distance to the current time
point is large. To define such a detection rule, let us introduce the following sequential
kernel-weighted Dickey-Fuller (DF) process

$$D_T(s) = \frac{\lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1} \Delta Y_t \, K((\lfloor Ts \rfloor - t)/h)}{\lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2}, \quad s \in [0, 1], \tag{7}$$

where $\Delta Y_t = Y_t - Y_{t-1}$. Here and in the following we put $0/0 = 0$ for convenience. Note
that $\lfloor Ts \rfloor$ plays the role of the current time point. The non-negative smoothing kernel $K$
is used to attach smaller weights to summands from the distant past to avoid that such
summands dominate the sum. Thus, kernels ensuring that $z \mapsto K(|z|)$, $z \in \mathbb{R}$, is decreasing
are appropriate, but that property is not required. We do not use kernel weights in the
denominator, since it is used to estimate a nuisance parameter. We will require the following
regularity conditions for $K : \mathbb{R} \to [0, \infty)$.

(K1) $\|K\|_\infty < \infty$, $\int K(z)\,dz = 1$ and $\int z K(z)\,dz = 0$.
(K2) $K$ is differentiable with bounded derivative.
(K3) $K$ has bounded variation.

Note that it is not required to use a kernel with compact support.



The parameter $h = h_T$ is used as a scaling constant in the kernel and defines the memory
of the procedure. For instance, if $K(z) > 0$ if $z \in [-1, 1]$ and $K(z) = 0$ otherwise, the
process $D_T$ looks back $h$ observations. We will assume that

$$T / h_T \to \zeta, \quad T \to \infty, \tag{8}$$

for some $1 \le \zeta < \infty$. That condition ensures that the number of observations used by $D_T$
gets larger as $T$ increases. Note that the parameter $\zeta$, which will also appear in the limit
distributions, could be absorbed into the kernel $K$. However, in practice one usually fixes
a kernel $K$ and chooses a bandwidth $h$ relative to the time horizon $T$. (8) is therefore not
restrictive.
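
A minimal sketch of the kernel-weighted DF process at a single current time point $n = \lfloor Ts \rfloor$ is given below. It assumes the form implied by the representation used in the proof of Theorem 2.1, namely a kernel-weighted numerator $n^{-1}\sum_{t \le n} Y_{t-1}\Delta Y_t K((n-t)/h)$ and an unweighted denominator $n^{-2}\sum_{t \le n} Y_{t-1}^2$. The Gaussian kernel, the function names, and the toy data are illustrative assumptions, not choices made in the article.

```python
import numpy as np


def gaussian_kernel(z):
    """Standard normal density: bounded, integrates to 1, first moment 0."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)


def weighted_df(y, n, h, kernel=gaussian_kernel):
    """Kernel-weighted DF statistic at current time n for y = (Y_0, ..., Y_T)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, n + 1)
    y_lag = y[t - 1]
    dy = y[t] - y[t - 1]
    num = np.sum(y_lag * dy * kernel((n - t) / h)) / n
    den = np.sum(y_lag ** 2) / n ** 2
    return num / den if den > 0 else 0.0  # convention 0/0 = 0


# With constant weights K = 1 the statistic reduces to the ordinary DF statistic.
flat = weighted_df([0.0, 1.0, 3.0, 2.0], n=3, h=1.0, kernel=lambda z: np.ones_like(z))
```

Choosing a smaller bandwidth `h` concentrates the weights on recent summands, which is the mechanism intended to speed up detection after a change to stationarity.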

1.3. Dickey-Fuller type control charts. Since small values of $D_T(s)$ provide evidence
for the alternative that the time series is stationary, intuition suggests that the control
chart should give a signal if $D_T$ is smaller than a specified control limit $c$. Hence, we define

$$S_T = S_T(c) = \inf\{k \le t \le T : D_T(t/T) < c\}, \quad \inf \emptyset = \infty.$$

We will assume that the start of monitoring, $k$, is given by

$$k = \lfloor T\kappa \rfloor, \quad \text{for some } \kappa \in (0, 1).$$

A reasonable approach to choose $c$ is to control the type I error rate $\alpha \in (0, 1)$, i.e., to
ensure that

$$\lim_{T \to \infty} P_0(S_T(c) \le T) = \alpha, \tag{9}$$

where $P_0$ indicates that the probability is calculated assuming that $\{Y_t\}$ is a random walk
corresponding to the null hypothesis $H_0 : \rho = 1$.
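
Since a signal occurs up to time $T$ exactly when the minimum of the monitored statistic over $k \le t \le T$ falls below $c$, a finite-sample stand-in for the calibration requirement is the $\alpha$-quantile of that minimum under the null model. The sketch below estimates it by Monte Carlo. It is an illustration only: it assumes i.i.d. standard normal innovations (so $\vartheta = 1$), a Gaussian kernel, and the kernel-weighted statistic in the form implied by the proof of Theorem 2.1; all function names and parameter values are hypothetical, and the result is a simulated finite-sample limit rather than the asymptotic one.

```python
import numpy as np


def gaussian_kernel(z):
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)


def weighted_df_path(y, h, k_start, kernel):
    """Kernel-weighted DF statistic at all monitoring times n = k_start, ..., T."""
    T = len(y) - 1
    y_lag, dy = y[:-1], np.diff(y)
    out = np.empty(T - k_start + 1)
    for idx, n in enumerate(range(k_start, T + 1)):
        t = np.arange(1, n + 1)
        w = kernel((n - t) / h)
        num = np.sum(y_lag[:n] * dy[:n] * w) / n
        den = np.sum(y_lag[:n] ** 2) / n ** 2
        out[idx] = num / den if den > 0 else 0.0
    return out


def simulate_control_limit(T, h, kappa, alpha, n_rep, kernel, seed=0):
    """alpha-quantile of the path minimum under a Gaussian random walk (assumption)."""
    rng = np.random.default_rng(seed)
    k_start = max(1, int(np.floor(T * kappa)))
    minima = np.empty(n_rep)
    for r in range(n_rep):
        y = np.concatenate(([0.0], np.cumsum(rng.standard_normal(T))))
        minima[r] = weighted_df_path(y, h, k_start, kernel).min()
    return np.quantile(minima, alpha)


c_05 = simulate_control_limit(T=40, h=8.0, kappa=0.25, alpha=0.05,
                              n_rep=100, kernel=gaussian_kernel, seed=1)
```

In practice one would use many more replications and, for correlated errors, account for the nuisance parameter as discussed in the following subsections.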

1.4. DF control chart with estimated control limit. In the next section we will
show that $D_T$ converges weakly to some stochastic process $\mathcal{D}_\vartheta$ depending on the nuisance
parameter

$$\vartheta = \lim_{T \to \infty} \vartheta_T = \eta / \sigma,$$

and that $S_T / T$ converges in distribution to $\inf\{s \in [\kappa, 1] : \mathcal{D}_\vartheta(s) < c\}$. Hence, if $c$ is chosen
from the asymptotic distribution via (9), $c = c(\vartheta)$ is a function of $\vartheta$. Therefore, the basic
idea is to estimate $\vartheta$ at each time point using only past and current data, and to use the
corresponding limit.

Our estimator for $\vartheta$ will be based on a Newey-West type estimator, thus circumventing
the problem of specifying the short memory dynamics of the process explicitly. Let $\gamma(k) =
E(\epsilon_1 \epsilon_{1+k})$ and denote by $r(k) = \gamma(k) / E(\epsilon_1^2)$, $k \in \mathbb{N}$, the autocorrelation function of the
time series $\{\epsilon_t\}$. Since $\epsilon_t = \Delta Y_t$ if $\rho = 1$, we can estimate $\gamma(k)$ and $r(k)$ under the null
hypothesis by

$$\hat r_t(k) = \hat\gamma_t(k) / \hat\sigma_t^2, \qquad \hat\gamma_t(k) = t^{-1} \sum_{s=k}^{t} \Delta Y_s \Delta Y_{s-k}, \qquad \hat\sigma_t^2 = t^{-1} \sum_{s=1}^{t} \Delta Y_s^2. \tag{10}$$

The parameter $\vartheta$ can now be estimated by the Newey-West estimator given by

$$\hat\vartheta_t = \hat\eta_t / \hat\sigma_t, \qquad \hat\eta_t^2 = \hat\sigma_t^2 + 2 \sum_{i=1}^{m} w(m, i)\, \hat\gamma_t(i), \tag{11}$$

where $w(m, i) = (m - i)/m$ are the Bartlett weights and $m$ is a lag truncation parameter,
see Newey and West (1987). Andrews (1991) studies more general weighting functions and
shows that the rate $m = o(T^{1/2})$ is sufficient for consistency.
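
The estimator of the nuisance parameter can be sketched as follows, assuming NumPy and the weights $w(m, i) = (m - i)/m$ stated in the text. The function name `newey_west_vartheta` is hypothetical, and the lag-$i$ sums are started at the first index for which both differences are observed, which is an implementation choice of this sketch.

```python
import numpy as np


def newey_west_vartheta(dy, m):
    """Estimate vartheta = eta / sigma from differences dy = (dY_1, ..., dY_t)."""
    dy = np.asarray(dy, dtype=float)
    t = len(dy)
    sigma2 = np.sum(dy ** 2) / t
    eta2 = sigma2
    for i in range(1, m + 1):
        gamma_i = np.sum(dy[i:] * dy[:-i]) / t   # lag-i sample autocovariance
        eta2 += 2.0 * (m - i) / m * gamma_i      # Bartlett weight w(m, i)
    # Guard against tiny negative values caused by floating point rounding.
    return np.sqrt(max(eta2, 0.0) / sigma2)


# Toy example: perfectly alternating differences are strongly negatively correlated.
vartheta_hat = newey_west_vartheta([1.0, -1.0, 1.0, -1.0], m=2)
```

In a monitoring application the function would be re-evaluated at each time point $t$ on the differences observed so far, and the estimated control limit $c(\hat\vartheta_t)$ looked up from a table or simulation.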

The Dickey-Fuller control chart for correlated time series now works as follows. At each
time point $t$ we estimate $\vartheta$ by $\hat\vartheta_t$ and calculate the corresponding estimated control limit
$c(\hat\vartheta_t)$. A signal is given if $D_T$ is less than the estimated control limit, i.e., we use the rule

$$\widehat{S}_T = \inf\{k \le t \le T : D_T(t/T) < c(\hat\vartheta_t)\}.$$

1.5. DF control chart based on a transformation. Alternatively, one may use a
transformation of $D_T$, namely

$$E_T(s) = D_T(s) + \frac{\frac{1}{2}\big(\hat\sigma_{\lfloor Ts \rfloor}^2 - \hat\eta_{\lfloor Ts \rfloor}^2\big) \lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} K((\lfloor Ts \rfloor - t)/h)}{\lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2}. \tag{12}$$

It seems that this transformation idea dates back to Phillips (1987). We will show that for
arbitrary $\vartheta$ the process $E_T$ converges weakly to the limit of $D_T$ for $\vartheta = 1$. Consequently,
if $c$ denotes the control limit ensuring that $S_T$ has size $\alpha$ when $\vartheta = 1$, then the detection
rule

$$Z_T = \inf\{k \le t \le T : E_T(t/T) < c\}$$

has asymptotic size $\alpha$ for any $\vartheta$.

In the next section we shall show that both procedures are asymptotically valid.

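1.6. Extensions to Dickey-Fuller processes. Inference on the AR parameter in the
unit root case is often based on the t-statistic associated with $D_T$, which gives rise to
Dickey-Fuller t-processes. The Dickey-Fuller t-statistic, $t_{DF}$, associated with $D_T = T(\hat\rho_T - 1)$, is
the standard computer output quantity when running a regression of $Y_t$ on $Y_{t-1}$. For a
sample $Y_1, \dots, Y_T$, the statistic $t_{DF}$ is defined as

$$t_{DF} = (\hat\rho_T - 1)/\hat\tau_T = T(\hat\rho_T - 1)/(T \hat\tau_T),$$

where

$$\hat\tau_T^2 = s_T^2 \Big/ \sum_{t=1}^{T} Y_{t-1}^2 \quad \text{with} \quad s_T^2 = (T - 1)^{-1} \sum_{t=1}^{T} (Y_t - \hat\rho_T Y_{t-1})^2.$$

The formula for $t_{DF}$ motivates scaling $D_T$ analogously. Hence, let us define the weighted
t-type DF process by

$$\widetilde{D}_T(s) = D_T(s) / (\lfloor Ts \rfloor \hat\tau_{\lfloor Ts \rfloor}), \quad s \in (0, 1], \tag{13}$$

and $\widetilde{D}_T(0) = 0$. $\widetilde{D}_T(s)$ is a weighted version of $t_{DF}$ calculated using the observations
$Y_1, \dots, Y_{\lfloor Ts \rfloor}$, and attaching kernel weights $K((\lfloor Ts \rfloor - t)/h)$ to the $t$th summand in the
numerator. The associated detection rule for known $\vartheta$ is defined as

$$\widetilde{S}_T = \widetilde{S}_T(c) = \inf\{k \le t \le T : \widetilde{D}_T(t/T) < c(\vartheta)\}$$

with $c(\vartheta)$ such that $\lim_{T \to \infty} P_0(\widetilde{S}_T(c(\vartheta)) \le T) = \alpha$.
Again, it turns out that the asymptotic limit of $\widetilde{D}_T$ depends on the nuisance parameter $\vartheta$.
The weighted t-type DF control chart with estimated control limits is defined as

$$\widehat{\widetilde{S}}_T = \inf\{k \le t \le T : \widetilde{D}_T(t/T) < c(\hat\vartheta_t)\}.$$

Alternatively, one can transform the process to achieve that the asymptotic limit is invariant
with respect to $\vartheta$. We define

$$\widetilde{E}_T(s) = \frac{\hat\sigma_{\lfloor Ts \rfloor}}{\hat\eta_{\lfloor Ts \rfloor}} \widetilde{D}_T(s) + \frac{\hat\sigma_{\lfloor Ts \rfloor}^2 - \hat\eta_{\lfloor Ts \rfloor}^2}{2 \hat\eta_{\lfloor Ts \rfloor}} \cdot \frac{\lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} K((\lfloor Ts \rfloor - t)/h)}{\big(\lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2\big)^{1/2}}, \quad s \in (0, 1]. \tag{14}$$

We will show that the detection rule

$$\widetilde{Z}_T = \inf\{k \le t \le T : \widetilde{E}_T(t/T) < c(1)\}$$

has asymptotic type I error equal to $\alpha$ for all $\vartheta$.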



2. Asymptotic results for random walks 



In this section we provide functional central limit theorems for the Dickey-Fuller processes
defined in the previous section under a random walk model assumption corresponding to
the null hypothesis $H_0 : \rho = 1$ in model (2), and the related central limit theorem for
the associated stopping rules. These results can be used to design tests and detection
procedures having well-defined statistical properties under the null hypothesis.

2.1. Weighted Dickey-Fuller processes. We start with the following functional central
limit theorem providing the limit distribution of the weighted DF process $D_T(s)$, $s \in [0, 1]$,
which extends Phillips (1987, Th. 3.1 c).

Theorem 2.1. Assume the time series $\{Y_t\}$ satisfies model (2) with $\rho = 1$ such that (E1)
and (K1)-(K3) hold. Then

$$D_T \Rightarrow \mathcal{D}_\vartheta,$$

as $T \to \infty$, where the stochastic process

$$\mathcal{D}_\vartheta(s) = \frac{K(0) B(s)^2 + \zeta \int_0^s B(r)^2 K'(\zeta(s - r))\,dr - \vartheta^{-2} \int_0^s K(\zeta(s - r))\,dr}{(2/s) \int_0^s B^2(r)\,dr}, \tag{15}$$

$s \in (0, 1]$, $\mathcal{D}_\vartheta(0) = 0$, is continuous w.p. 1.



Remark 2.1. Note that the asymptotic limit is distribution-free if and only if $\eta = \sigma$,
which holds if the error terms are uncorrelated. Otherwise, the distribution of $\mathcal{D}_\vartheta$ depends
sensitively on $\vartheta$.

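Proof. If $\rho = 1$ we have $\epsilon_t = \Delta Y_t$ and $Y_{t-1} \epsilon_t = (1/2)(Y_t^2 - Y_{t-1}^2 - \epsilon_t^2)$ for all $t$. This yields
the representation

$$D_T(s) = \frac{V_T(s) - R_T(s)}{W_T(s)}, \quad s \in (0, 1],$$

where the $D[0, 1]$-valued stochastic processes $V_T$, $R_T$, and $W_T$ are given by

$$V_T(s) = (2 \lfloor Ts \rfloor)^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} (Y_t^2 - Y_{t-1}^2) K((\lfloor Ts \rfloor - t)/h),$$

$$R_T(s) = (2 \lfloor Ts \rfloor)^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} \epsilon_t^2 K((\lfloor Ts \rfloor - t)/h),$$

$$W_T(s) = \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2,$$

for $s \in (0, 1]$. Let us first show that

$$\sup_{s \in [\kappa, 1]} |R_T(s) - \mu(s)| \xrightarrow{P} 0, \tag{16}$$

as $T \to \infty$, where

$$\mu(s) = \frac{\sigma^2}{2s} \int_0^s K(\zeta(s - r))\,dr, \quad s \in (0, 1].$$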



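Consider

$$|E(R_T(s)) - \mu(s)| = \frac{\sigma^2}{2} \left| \lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} K((\lfloor Ts \rfloor - t)/h) - s^{-1} \int_0^s K(\zeta(s - r))\,dr \right|.$$

(8) ensures that $\sup_{s \in [\kappa, 1]} \max_t |(\lfloor Ts \rfloor - t)/h - \zeta(s - t/T)| = o(1)$, yielding

$$\lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} K((\lfloor Ts \rfloor - t)/h) = s^{-1} \int_0^s K(\zeta(s - r))\,dr + o(1),$$

uniformly in $s \in [\kappa, 1]$, because $K$ is Lipschitz continuous and of bounded variation, cf.
Theorem 3.3(ii) of Steland (2004). It remains to estimate $|R_T(s) - E(R_T(s))|$. The assumptions
on $\{\epsilon_t\}$ ensure that

$$Z_T(r) = T^{-1/2} \sum_{i=1}^{\lfloor Tr \rfloor} (\epsilon_i^2 - E\epsilon_i^2) \Rightarrow \eta' B^{(2)}(r),$$

as $T \to \infty$, where $\eta'^2 = \mathrm{Var}(\epsilon_1^2) + 2 \sum_{j=1}^{\infty} \mathrm{Cov}(\epsilon_1^2, \epsilon_{1+j}^2)$. Hence, passing to equivalent
versions if necessary, we may assume that $\|Z_T - \eta' B^{(2)}\|_\infty \to 0$ a.s., for $T \to \infty$. By (K3) the Stieltjes
integrals $\int_0^s K(\zeta(s - r))\,dB^{(2)}(r)$ and $\int_0^s K(\zeta(s - r))\,dZ_T(r)$ are well defined (via integration
by parts), and

$$\sup_{s \in [\kappa, 1]} \left| \int_0^s K(\zeta(s - r))\,dZ_T(r) - \eta' \int_0^s K(\zeta(s - r))\,dB^{(2)}(r) \right| \xrightarrow{a.s.} 0,$$

as $T \to \infty$. Obviously,

$$\sup_{s \in [\kappa, 1]} |R_T(s) - E R_T(s)| = \sup_{s \in [\kappa, 1]} \frac{\sqrt{T}}{2 \lfloor Ts \rfloor} \left| \int_0^s K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h)\,dZ_T(r) \right|$$

$$\le \sup_{s \in [\kappa, 1]} \frac{\eta' \sqrt{T}}{2 \lfloor Ts \rfloor} \left| B^{(2)}(r) K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h) \Big|_{r=0}^{r=s} - \int_0^s B^{(2)}(r)\, K((\lfloor Ts \rfloor - \lfloor T(dr) \rfloor)/h) \right|$$

$$+ \sup_{s \in [\kappa, 1]} \frac{\sqrt{T}}{2 \lfloor Ts \rfloor} \left| [Z_T(r) - \eta' B^{(2)}(r)] K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h) \Big|_{r=0}^{r=s} - \int_0^s [Z_T(r) - \eta' B^{(2)}(r)]\, K((\lfloor Ts \rfloor - \lfloor T(dr) \rfloor)/h) \right|.$$

Noting that the total variation of the functions $r \mapsto K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h)$, $s \in [\kappa, 1]$, $T \ge 1$,
is uniformly bounded, the right side of the above display can be estimated by

$$O\big(T^{-1/2} \|B^{(2)}\|_\infty \|K\|_\infty\big) + O\Big(T^{-1/2} \|B^{(2)}\|_\infty \int |dK|\Big) + O\big(T^{-1/2} \|Z_T - \eta' B^{(2)}\|_\infty \|K\|_\infty\big) + O\Big(T^{-1/2} \|Z_T - \eta' B^{(2)}\|_\infty \int |dK|\Big)$$

$$= O_P(1/\sqrt{T}) = o_P(1).$$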

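Therefore, (16) holds true. Let us now consider $V_T$. We will first show that, up to terms of
order $o_P(1)$, $V_T$ is a functional of

$$U_T(r) = T^{-1/2} Y_{\lfloor Tr \rfloor}, \quad r \in [0, 1].$$

Again, under the assumptions of the theorem, $U_T$ converges weakly to $\eta B$, where $B$ denotes
Brownian motion and $\eta > 0$ is a constant. For brevity of notation let

$$k_T(r; s) = K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h), \quad r, s \in [0, 1].$$

Integration by parts yields

$$V_T(s) = \frac{1}{2 \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} (Y_t^2 - Y_{t-1}^2) K((\lfloor Ts \rfloor - t)/h)$$

$$= \frac{T}{2 \lfloor Ts \rfloor} \int_0^s k_T(r; s)\,dU_T^2(r)$$

$$= \frac{T}{2 \lfloor Ts \rfloor} \left[ k_T(r; s) U_T^2(r) \Big|_{r=0}^{r=s} - \int_0^s U_T^2(r)\, k_T(dr; s) \right]$$

$$= \frac{1}{2s} k_T(r; s) U_T^2(r) \Big|_{r=0}^{r=s} + \frac{\zeta}{2s} \int_0^s U_T^2(r) K'(\zeta(s - r))\,dr + o_P(1)$$

$$= \frac{K(0) U_T^2(s)}{2s} + \frac{\zeta}{2s} \int_0^s U_T^2(r) K'(\zeta(s - r))\,dr + o_P(1).$$

Due to (K2) the $o_P(1)$ term is uniform in $s \in [\kappa, 1]$. Next note that

$$W_T(s) = \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2 = \left( \frac{T}{\lfloor Ts \rfloor} \right)^2 \int_0^{\lfloor Ts \rfloor / T} U_T^2(r)\,dr.$$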
We are now in a position to verify joint weak convergence of numerator and denominator
of $D_T$. The Lipschitz continuity of $K$ ensures that up to terms of order $o_P(1)$ for all
$(\lambda_1, \lambda_2) \in \mathbb{R}^2$ the linear combination $\lambda_1 (V_T(s) - R_T(s)) + \lambda_2 W_T(s)$ is a functional of $U_T$,
and that functional is continuous. Therefore, the continuous mapping theorem (CMT)
entails weak convergence to the stochastic process

$$\lambda_1 \left[ \frac{\eta^2 K(0) B(s)^2}{2s} + \frac{\zeta \eta^2}{2s} \int_0^s K'(\zeta(s - r)) B^2(r)\,dr - \frac{\sigma^2}{2s} \int_0^s K(\zeta(s - r))\,dr \right] + \lambda_2 \frac{\eta^2}{s^2} \int_0^s B(r)^2\,dr.$$

This verifies joint weak convergence of $(V_T - R_T, W_T)$. Hence, the result follows by the
CMT. (K2) also ensures that $\mathcal{D}_\vartheta \in C[0, 1]$ w.p. 1. □

The central limit theorem (CLT) for the detection procedure $S_T$, which requires knowledge
of $\vartheta$, appears as a corollary.

Corollary 2.1. Under the assumptions of Theorem 2.1 we have for any control limit $c < 0$

$$S_T / T \xrightarrow{d} \inf\{s \in [\kappa, 1] : \mathcal{D}_\vartheta(s) < c\}$$

as $T \to \infty$, where $\mathcal{D}_\vartheta(s)$ is defined in (15).

Proof. Observe that by definition of $S_T$

$$S_T / T > x \iff \inf_{s \in [\kappa, x]} D_T(s) \ge c \iff \sup_{s \in [\kappa, x]} -D_T(s) \le -c$$

for any $x \in \mathbb{R}$. Hence it suffices to show that

$$P\Big( \sup_{s \in [\kappa, x]} -D_T(s) \le -c \Big) \to P\Big( \sup_{s \in [\kappa, x]} -\mathcal{D}_\vartheta(s) \le -c \Big),$$

where $\mathcal{D}_\vartheta$ denotes the limit process given in Theorem 2.1. Using the Skorokhod/Dudley/
Wichura representation theorem and a result due to Lifshits (1982), this fact can be shown
along the lines of the proof of Theorem 4.1 in Steland (2004), if $c < 0$, since $\mathcal{D}_\vartheta \in C[0, 1]$
a.s. For brevity we omit the details. □

Let us now show consistency of the detection procedure $\widehat{S}_T = \inf\{k \le t \le T : D_T(t/T) <
c(\hat\vartheta_t)\}$, which uses estimated control limits.

Theorem 2.2. Assume (E1) and (E2), (K1)-(K3), and in addition that the lag truncation
parameter, $m$, of the Newey-West estimator satisfies

m — I — >■ oo. 

Then the weighted Dickey-Fuller type control chart with estimated control limit, $\widehat{S}_T$, is
consistent, i.e.,

$$P(\widehat{S}_T \le T) \to \alpha,$$

as $T \to \infty$.

Proof. Note that the equivalence St > T ^ inig^^^^i] Dt{s) / c{d\Ts\) ^ 1 imphes 

(17) P{St <T) = p[ inf < 1 

Let us first show that the function c is continuous. Note that the process 'P^(s) can be 
written as £{s) — ■^~'^J^{s) for a.s. continuous processes S and not depending on ■(?, where 
particularly J^(0) = and 

F{s) = {s/2) I K{Q{s-r))dr/ B^r) dr, s G (0, 1]. 
Jo Jo 

Let ■j?„:n>l}clRbea sequence with 1?^ — >■ as n — >■ oo. Clearly, for each u of 
the underlying probability space with |^(<x')|, |^('^)| < oo, we have 

n oo. Hence, sup^gf^^ij P^„(s) 4 sup^gf^^ij as n oo. Since sup,g[^_i] 

has a continuous density, this is equivalent to pointwise convergence of the d.f. Fn{z) — 
^(sup,e[^,i]2^^„(s) < z) to F{z) = P(sup,g[^_i]r>^*(s) < z), as n oo, for all z e R. 
Hence, 

c{dn) = F-\a) ^ F-\a) = c(r), 

as n — >■ oo. Next we show 

$$\hat\vartheta_{\lfloor Ts \rfloor} \Rightarrow \vartheta, \tag{18}$$

as $T \to \infty$, in $D[\kappa, 1]$. Since for each $s \in [\kappa, 1]$ we have $\hat\vartheta_{\lfloor Ts \rfloor} \xrightarrow{P} \vartheta$, for $T \to \infty$, fidi
convergence follows immediately. It remains to verify tightness. Recall the definitions (10)
and (11) and that $\Delta Y_t = \epsilon_t$ under $H_0$. Fix $j$ and consider the process $\hat\gamma_{\lfloor Ts \rfloor}(j)$, $s \in [\kappa, 1]$,
which is a functional of $\{\epsilon_t \epsilon_{t-j} : t = j, j + 1, \dots\}$. Clearly, by the Cauchy-Schwarz inequality
and (El) E\etet-j\'^'^^ < E\et\'^'^'^^ < oo for some 6 > 0. Further, since J^i^o = (^{^s^s-j '■ s < 
t) C = cr(e, : s <t) and J'^,^ = a{eses-j : s > t + k) C J^^^, = a{es : s>t + k-j), 
the mixing coefficients 5(/c) of {etet-j} satisfy 

a{k) = sup _ sup ^ \P{AnB) - P{A)P{B)\ 

< sup sup \P{AnB) - P{A)P{B)\ = a{k- j), 

where {a{k)} are the mixing coefficients of {et}. Due to (El) we can apply Yokohama 
(1980, Th.l) with r = 2 + 25 to conclude that for k < r < s < 1 



E 



lTs\ 
t=[Tr i+1 



2+25 

1+S\ 



0{\s-r\ 



Now the decomposition 

E 

t=[TrJ+l 

and the triangle inequality yield 



T 1 ^^'^ f T T \ 1 ^^"^ 



l^illTsiU) - 7LT.j(j))ll2+2. = 0(s-i|s - rr+^)/(2+2^)) + 0(|1/. - l/r|r(i+^)/(2+2^)) 



0(|s-r 



(l+5)/(2+25)^ 



since, firstly, we may assume < 5 < 1, and, secondly, both s ^ and ^ <5)/(2+25) 
bounded away from and oo for < r < s < 1. Consequently, 
and therefore Vaart and Wellner (1986, Ex. 2.2.3) implies tightness of the process $\{\sqrt{T}\,\hat\gamma_{\lfloor Ts \rfloor}(j) :
s \in [\kappa, 1]\}$ for fixed $j \ge 0$. Note that $\hat\gamma_{\lfloor Ts \rfloor}(0) = \hat\sigma_{\lfloor Ts \rfloor}^2$. By the triangle inequality we have

m 

l|v^(%TsJ -%rrj)||2+25 < 2 ^(1 - j/m)2+2^||7LTsJ (j) " 7LTrJ (j) ||2+25 

i=0 

yielding 

Hence, { [TsJ , S'^^^j ) : s G [k, 1]} is tight in the product space, which implies weak conver- 
gence of {'(9[Tsj : s G 1]} to d. The final step is to verify 

(19) inf DT{s)lc{d\Ts\)^ inf V^{s)/c{d), 

sg[K,i] se[K,i] 

as T — )■ oo, since this implies that ( IT7|) converges to P(infsg[K,i] ^^^(s) < c('^)) = a, as 
T — 7- oo. Due to f[T5]) we can conclude that 

(Dt(-),Vj)^(^^^(-),^) 

in the product space {D[ti, 1])^. Note that the mapping 99 : (-D[/s:, 1], (i)^ — ^ (M, given by 

xis\ 

(f{x,y) = inf — — --, x,y e D[k,1], y eR, 
se[K,i] c{y{s)) 

is continuous in all {x,y) G (C[fi;, 1])^. Since G C[0, 1] w.p. 1 and c G C(M), f fT9|) 
follows. □ 

It remains to provide the related weak convergence results for the transformed process $E_T$
and its natural detection rule $Z_T = \inf\{k \le t \le T : E_T(t/T) < c\}$.

Theorem 2.3. Assume (E1), (E2), and (K1)-(K3). Additionally assume that the lag
truncation parameter, $m$, of the Newey-West estimator satisfies

$$m = o(T^{1/2}), \quad T \to \infty.$$

Then,

$$E_T(s) \Rightarrow \mathcal{D}_1(s), \quad \text{in } (D[\kappa, 1], d),$$

as $T \to \infty$, and for the transformed Dickey-Fuller type control chart we have

$$Z_T / T \xrightarrow{d} \inf\{\kappa \le t \le 1 : \mathcal{D}_1(t) < c\},$$

as $T \to \infty$. Particularly, the asymptotic distributions are invariant with respect to $\vartheta$.
Proof. As shown above, 



as T ^ oo, which imphes that 

DT(s),LTsJ-'5^F/_i,77jr.j,?fT.j ^{V^{s),vVs' / B'{r)dr,rj^a'), 
t=i J -^0 

if T — )■ oo, yielding 



Dt{s) + 



a^-n^s-^ ['K{C{s-r))dr 



□ 



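2.2. Weighted Dickey-Fuller t-processes. Let us now derive (functional) central limit
theorems for the weighted Dickey-Fuller t-processes and the associated detection rules. We
start with the process $\widetilde{D}_T$ under the random walk null hypothesis.

Theorem 2.4. Assume (E1), and (K1)-(K3). Then

$$\widetilde{D}_T \Rightarrow \widetilde{\mathcal{D}}_\vartheta, \quad \text{in } (D[\kappa, 1], d),$$

as $T \to \infty$, where

$$\widetilde{\mathcal{D}}_\vartheta(s) = \frac{\frac{1}{2} \left\{ \vartheta K(0) B(s)^2 + \vartheta \zeta \int_0^s B(r)^2 K'(\zeta(s - r))\,dr - \vartheta^{-1} \int_0^s K(\zeta(s - r))\,dr \right\}}{\left\{ \int_0^s B(r)^2\,dr \right\}^{1/2}}$$

for $s \in (0, 1]$ and $\widetilde{\mathcal{D}}_\vartheta(0) = 0$. Here $\vartheta = \eta / \sigma$. $\widetilde{\mathcal{D}}_\vartheta$ is continuous a.s.

Remark 2.2. Note that again the limit depends on the nuisance parameter $\vartheta$ and is
distribution-free if and only if $\vartheta = 1$.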



Proof. By definition 



where 



Dt{s) 



Dt(s) 



[Ts\ 



02 



with 



[Ts] 



^-1 t=i 

for s G (0, 1]. Note that for t = 1, . . . , [Ts\ 



C2 _ 



e,{[Ts\)-et = -{p^_Tsi-m-i. 



Hence, we obtain 



r.2 



= lTs\ - 1 ^ " ^^LT.J - 1) ^y^j _ 1 E + (/^L^^J - ^TsJ - 1 ^ 
From the proof of Theorem 12.11 we know that 

sup [Ts\-'J2Yt-i= sup -— / {T-'/%rri)' dr = Op{l) 
se{o,i] TT ^e(o,i] V -^J / Jo 



and 



sup 

se(o,i] 



lTs\ 



< sup 

se(o,i] 



lTs\ 



t=i 



sup \[Ts\-'/\ts\\=0p{1). 

se(o,i] 



Combining these facts with sup^g(o,i] L^-^J Ipl^'^J ~ ~ C'p(l), we obtain 



lTs\ 



t=i 
where the $o_P(1)$ term is uniform in $s \in (0, 1]$. Because (E1) implies that

$$\gamma_2(k) = \mathrm{Cov}(\epsilon_1^2, \epsilon_{1+k}^2) = o(1), \quad k \to \infty,$$

we may apply the law of large numbers for time series (Brockwell and Davis (1991), Th.
7.1.1) and obtain, since stochastic convergence to a constant yields stochastic convergence
in the Skorokhod topology,



(20) 



as T — 7- oo. We shall now show joint weak convergence of {Dt{s), S'^r 
s G (0, 1]. Let (Ai, A2, A3) G - {0} and consider 



X^Dris) + X^Sf^,^ + A3 [Ts\ ' 



s G [k, 1]. 



t=i 



The proof of Theorem 12.11 implies that 



AiDr(s) + A3LTsJ~'J]F,ii^AiI)^(s) + A34 / B{rfdr, 




as T —7- 00. Due to (1201) . we obtain 



X^DT{s) + X2SlT,^+Xs[Ts\-'J2^l,^X^V4s) + X2cr^ + Xs^ / B{r) 





as T —7- 00. Therefore, the GMT implies that 




and 




□ 



We are now in the position to establish consistency of the t-type detection rule

$$\widehat{\widetilde{S}}_T = \inf\{k \le t \le T : \widetilde{D}_T(t/T) < c(\hat\vartheta_t)\},$$

which uses estimated control limits. Notice that Theorem 2.4 implies that $c(\vartheta)$ is given by
$P_0(\inf_{s \in [\kappa, 1]} \widetilde{\mathcal{D}}_\vartheta(s) < c(\vartheta)) = \alpha$.

Theorem 2.5. Assume (E1), (E2), (K1)-(K3), and additionally that the lag truncation
parameter of the Newey-West estimator satisfies

Then the t-type weighted Dickey-Fuller control chart with estimated control limits, $\widehat{\widetilde{S}}_T$, is
consistent, i.e.,

$$P(\widehat{\widetilde{S}}_T \le T) \to \alpha,$$

as $T \to \infty$.

Proof. The result is shown along the lines of the proof of Theorem 2.2, since the process
$\widetilde{\mathcal{D}}_\vartheta$ is continuous w.p. 1, and is a continuous function of $\vartheta$. □

Finally, for the transformed process $\widetilde{E}_T$ and the associated control chart we have the
following result.

Theorem 2.6. Assume (E1), (E2), (K1)-(K3), and

$$m = o(T^{1/2}), \quad T \to \infty.$$

Then the transformed t-type weighted DF process $\widetilde{E}_T$, defined in (14), converges weakly,

$$\widetilde{E}_T \Rightarrow \widetilde{\mathcal{D}}_1, \quad \text{in } (D[0, 1], d),$$

as $T \to \infty$, and for the transformed t-type weighted DF control chart we have

$$\widetilde{Z}_T / T \xrightarrow{d} \inf\{\kappa \le s \le 1 : \widetilde{\mathcal{D}}_1(s) < c\}.$$

Particularly, the asymptotic distribution is invariant with respect to $\vartheta$.

Proof. Note that the first term of $\widetilde{E}_T$ converges weakly to $\vartheta^{-1} \widetilde{\mathcal{D}}_\vartheta$, which has the form
$[A(s) - \vartheta^{-2} \int_0^s K(\zeta(s - r))\,dr] / [\int_0^s B^2(r)\,dr]^{1/2}$. Hence, the construction of the correction
term is as for $E_T$. □
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3. ASYMPTOTICS UNDER LOCAL-TO-UNITY ALTERNATIVES 



In econometric applications, the stationary alternatives of interest are often of the form
$0 < \rho < 1$ with $1 - \rho$ small. To mimic this situation asymptotically, we consider a local-to-unity
model where the AR parameter depends on $T$ and tends to 1 as the time horizon $T$
increases.

The functional central limit theorem given below shows that the asymptotic distribution
under local-to-unity alternatives is also affected by the nuisance parameter $\vartheta$. However,
the term which depends on the parameter parameterising the local alternative does not
depend on $\vartheta$ (or $\eta$). Therefore, if one takes the nuisance parameter into account when
designing a detection procedure, one obtains local asymptotic power.

Let us assume that we are given an array $\{Y_{T,t}\} = \{Y_{T,t} : 1 \le t \le T, T \in \mathbb{N}\}$ of observations
satisfying

(21) $Y_{T,0} = 0, \quad Y_{T,t} = \rho_T Y_{T,t-1} + \epsilon_t, \quad t = 1, \dots, T, \quad T \ge 1,$

where the sequence of AR parameters $\{\rho_T\}$ is given by

$$\rho_T = 1 + a/T, \qquad T \ge 1,$$

for some constant $a$. $\{\epsilon_t\}$ is a mean-zero stationary I(0) process satisfying (E1). For brevity
of notation, $D_T$ denotes in this section the process with $Y_t$ replaced by $Y_{T,t}$.
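The array defined by the local-to-unity model can be generated directly from its recursion. The following is a minimal sketch, assuming i.i.d. standard normal errors (the text only requires a mean-zero I(0) process); the function name `simulate_local_to_unity` and its parameters are illustrative and not part of the original.

```python
import random


def simulate_local_to_unity(T, a, seed=0):
    """Return [Y_{T,0}, ..., Y_{T,T}] for rho_T = 1 + a/T.

    Assumption of this sketch: the errors are i.i.d. N(0, 1).
    """
    rng = random.Random(seed)
    rho_T = 1.0 + a / T          # AR parameter drifting to 1 as T grows
    y = [0.0]                    # Y_{T,0} = 0
    for _ in range(T):
        y.append(rho_T * y[-1] + rng.gauss(0.0, 1.0))
    return y


# a < 0 gives a stationary alternative close to the unit root; a = 0 is the random walk.
path = simulate_local_to_unity(T=250, a=-5.0)
```

Choosing `a = 0.0` recovers the random walk of the null hypothesis, so the same function can be used for both the null model and local alternatives.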

The limit distribution will be driven by an Ornstein-Uhlenbeck process. Recall that the
Ornstein-Uhlenbeck process $Z_a$ with parameter $a$ is defined by

(22) $Z_a(s) = \int_0^s e^{a(s-r)} \, dB(r), \qquad s \in [0, 1],$

where $B$ denotes Brownian motion.
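As a quick plausibility check of why the Ornstein-Uhlenbeck process appears, one can compare the variance of $Z_a(s)$, which equals $(e^{2as} - 1)/(2a)$ for $a \ne 0$ by the Itô isometry, with the variance of the scaled AR process $T^{-1/2} Y_{T,\lfloor Ts \rfloor}$. The sketch below assumes uncorrelated errors with unit variance; the helper names are hypothetical.

```python
import math


def ou_variance(a, s):
    """Variance of Z_a(s) = int_0^s exp(a(s - r)) dB(r)."""
    if a == 0.0:
        return s
    return (math.exp(2.0 * a * s) - 1.0) / (2.0 * a)


def scaled_ar_variance(T, a, s):
    """Variance of T^{-1/2} Y_{T,floor(Ts)} for uncorrelated unit-variance errors."""
    rho = 1.0 + a / T
    n = math.floor(T * s)
    # Y_{T,n} = sum_{t=1}^{n} rho^(n - t) * eps_t, so the variances of the summands add up.
    return sum(rho ** (2 * (n - t)) for t in range(1, n + 1)) / T


for T in (50, 500, 5000):
    print(T, scaled_ar_variance(T, -5.0, 1.0), ou_variance(-5.0, 1.0))
```

The finite-sample variance approaches the Ornstein-Uhlenbeck variance as $T$ grows, in line with the weak convergence used in the proof below.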

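Theorem 3.1. Assume (E1) and (K1)-(K3). Under the local-to-unity model (21) we have
for the weighted Dickey-Fuller process

$$D_T \Rightarrow \mathcal{D}^a_\vartheta,$$

as $T \to \infty$, where the a.s. $C[0, 1]$-valued process $\mathcal{D}^a_\vartheta$ is given by

$$\mathcal{D}^a_\vartheta(s) = \frac{K(0) Z_a^2(s) + \zeta \int_0^s Z_a^2(r) K'(\zeta(s - r)) \, dr - 2a \int_0^s Z_a^2(r) K(\zeta(s - r)) \, dr - \vartheta^{-2} \int_0^s K(\zeta(s - r)) \, dr}{(2/s) \int_0^s Z_a^2(r) \, dr}$$

for $s \in (0, 1]$, and $\mathcal{D}^a_\vartheta(0) = 0$. Here $Z_a$ denotes the Ornstein-Uhlenbeck process defined in
(22). Further,

$$S_T / T \stackrel{d}{\to} \inf\{s \in [\kappa, 1] : \mathcal{D}^a_\vartheta(s) < c\}, \qquad \text{as } T \to \infty.$$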
Proof. The crucial arguments to obtain joint weak convergence of numerator and denominator of $D_T$ have been given in detail in the proof of Theorem 2.1. Therefore, we give only
a sketch of the proof stressing the essential differences. First, note that

$$U_T(s) = T^{-1/2} Y_{T,\lfloor Ts \rfloor} = \int_0^s e_T(r; s) \, dS_T(r), \qquad S_T(r) = T^{-1/2} \sum_{t=1}^{\lfloor Tr \rfloor} \epsilon_t,$$

for the step function $e_T(r; s) = (1 + a/T)^{\lfloor Ts \rfloor - \lfloor Tr \rfloor}$, $r, s \in [0, 1]$, which has uniformly bounded
variation and converges uniformly in $r, s$ to the exponential $e(r; s) = e^{a(s - r)}$. Hence, firstly,
the stochastic Stieltjes integral $\int_0^s e_T(r; s) \, dS_T(r)$ exists (via integration by parts), and,
secondly, by estimating the terms of the decomposition $\int_0^s e_T \, dS_T - \int_0^s e \, d(\eta B) = \int_0^s (e_T - e) \, d(\eta B) + \int_0^s e_T \, d(S_T - \eta B)$ we see that

$$U_T(s) = \int_0^s e_T(r; s) \, dS_T(r) \Rightarrow \eta \int_0^s e(r; s) \, dB(r) = \eta Z_a(s),$$



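as $T \to \infty$. Next, note that in the local-to-unity model we have
for all $1 \le t \le T$. This yields the decomposition

where for $s \in (0, 1]$

$$V_{2,T}(s) = \frac{1 - \rho_T^2}{2 \rho_T \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{T,t-1}^2 K((\lfloor Ts \rfloor - t)/h),$$

$$V_{3,T}(s) = -\frac{1}{2 \rho_T \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} \epsilon_t^2 K((\lfloor Ts \rfloor - t)/h),$$

$$W_T(s) = \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{T,t-1}^2.$$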



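The term $V_{1,T}$ can be treated as in the proof of Theorem 2.1, namely,

$$V_{1,T}(s) = \frac{1}{2s} K(0) U_T^2(s) + \frac{\zeta}{2s} \int_0^s U_T^2(r) K'(\zeta(s - r)) \, dr + o_P(1).$$

From the proof of Theorem 2.1 we know that due to (E1)

$$\sup_{s \in [\kappa, 1]} \left| V_{3,T}(s) + \frac{\sigma^2}{2s} \int_0^s K(\zeta(s - r)) \, dr \right| \stackrel{P}{\to} 0,$$

as $T \to \infty$. Consider now $V_{2,T}$. By definition of $\rho_T$ we obtain

$$V_{2,T}(s) = \frac{1 - \rho_T^2}{2 \rho_T} \frac{1}{\lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{T,t-1}^2 K((\lfloor Ts \rfloor - t)/h) = \frac{-2a - a^2/T}{2(1 + a/T)} \frac{1}{T \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{T,t-1}^2 K((\lfloor Ts \rfloor - t)/h)$$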



= -HsW Zl{r)K{as~r))dr+'^K{Q)Zl{s) + Op{l), 

where due to (K2) the $o_P(1)$ term is uniform in $s \in (0, 1]$. Hence, $V_{1,T}$, $V_{2,T}$, and $W_T$
are functionals of $U_T$ up to terms of order $o_P(1)$. Consequently, joint weak convergence of
$(V_{1,T}, V_{2,T}, V_{3,T}, W_T)$ can be shown along the lines of the proof of Theorem 2.1, and the
CMT yields the result. □

4. Simulations



To investigate the statistical properties of the proposed monitoring procedure we performed
a simulation study. We used the following ARMA(1,1) simulation model. Suppose

$$Y_{t+1} = \rho Y_t + \epsilon_t - \beta \epsilon_{t-1}, \qquad t = 1, 2, \dots, T = 250,$$

where $Y_0 = 0$, $\{\epsilon_t\}$ is a sequence of independent $N(0, 1)$-distributed error terms, and
$\rho$ and $\beta$ are parameters. We investigated the cases given by $\rho = 1, 0.98, 0.95, 0.9$ and
$\beta = -0.8, -0.5, 0, 0.5, 0.8$. Clearly, $\rho = 1$ corresponds to the unit root null hypothesis. For
$\beta = 0$ the innovation terms are uncorrelated, corresponding to $\vartheta = 1$. This simulation model
was also used in Steland (2006), where a monitoring procedure based on the KPSS unit
root test is studied in detail. Since part of the parameter settings used below are identical,
the results of the present numerical study can be compared with the corresponding results
in Steland (2006).
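The simulation model above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the study: the recursion is started at $t = 0$ with $\epsilon_{-1} = 0$ (the text does not spell out the initialisation), and the function name `simulate_arma11` is hypothetical.

```python
import random


def simulate_arma11(T, rho, beta, seed=0):
    """Return [Y_0, ..., Y_T] from Y_{t+1} = rho*Y_t + e_t - beta*e_{t-1}.

    Assumptions of this sketch: the recursion starts at t = 0 with
    Y_0 = 0 and e_{-1} = 0, and the e_t are i.i.d. N(0, 1).
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(T)]   # e_0, ..., e_{T-1}
    y = [0.0]
    for t in range(T):
        previous_eps = eps[t - 1] if t > 0 else 0.0
        y.append(rho * y[t] + eps[t] - beta * previous_eps)
    return y


null_path = simulate_arma11(T=250, rho=1.0, beta=0.0)   # unit root, uncorrelated errors
alt_path = simulate_arma11(T=250, rho=0.9, beta=0.5)    # stationary alternative, MA errors
```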

To study the monitoring rules with estimated control limits, critical values for a significance
level of $\alpha = 5\%$ were taken from the limit process defined in (15) with estimated nuisance
parameter. To down-weight past contributions a Gaussian kernel with bandwidth $h = 25$
was used. The nuisance parameter was estimated by the Newey-West estimator at time
point $t$ with lag truncation parameter $m$ chosen by $m = m_t = \lfloor 4 (t/100)^{1/4} \rfloor$, $t = k, \dots, N$.
The start of monitoring, $k$, affects the properties and has to be chosen carefully. For the
rule $\hat S_T$ we used $k = 50$, whereas for $\hat{\tilde S}_T$ a larger value, $k = 75$, yielded better results.

To investigate the properties of the monitoring rule, we estimate empirical rejection rates
of the test which rejects the unit root null hypothesis if the procedure gives a signal, the
average delay, and the average conditional delay given a signal. For the detection rule $\hat S_T$
the ARL is defined by $E(\hat S_T) - k + 1$. We define the CARL as $E(\hat S_T \mid k \le \hat S_T \le T) - k + 1$.
The definitions for $\hat{\tilde S}_T$ are analogous. Note that the conditional delay is very informative
under the alternative, since it tells us how quickly the method reacts if it reacts at all. In
the tables average delays are given in brackets and conditional delays in parentheses.
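Given a list of simulated stopping times, the three summaries described above can be computed as follows. This is a hedged sketch: runs without a signal are represented by `math.inf` and, as an assumption of this example, are censored at the time horizon $T$ when the average delay is computed. The function name `delay_summaries` is illustrative.

```python
import math


def delay_summaries(stopping_times, k, T):
    """Return (rejection rate, ARL, CARL) from simulated stopping times.

    Assumption of this sketch: a run without a signal is coded as math.inf
    and counted as T in the average delay.
    """
    n = len(stopping_times)
    signals = [s for s in stopping_times if k <= s <= T]
    rejection_rate = len(signals) / n
    censored = [min(s, T) for s in stopping_times]
    arl = sum(censored) / n - k + 1
    # The conditional delay averages only over the runs that gave a signal.
    carl = sum(signals) / len(signals) - k + 1 if signals else math.nan
    return rejection_rate, arl, carl


rate, arl, carl = delay_summaries([50, 60, math.inf, 250], k=50, T=250)
```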
Table 1 provides the results for the monitoring procedures $\hat S_T$ and $\hat{\tilde S}_T$ using estimated
control limits. The curves $c(\vartheta)$ were obtained by simulating from the limit laws. Overall, $\hat S_T$
performed well. The performance of the t-type procedure is disappointing. When inspecting
the CARL values, the results seem puzzling. For example, when comparing the CARL
for $\rho = 0.95$ and $\rho = 0.9$ if $\beta = 0$, the procedure seems to misbehave. To explore the
reason, Figure 1 provides a part of the distribution of $\hat S_T - k + 1$. It can be seen that the
percentage of simulated trajectories leading to immediate detection increases considerably,
but the contribution of these cases to the calculation of the CARL is negligible. The
other trajectories yielding a signal are hard to detect, and the signals are spread over the
remaining time points with many late signals, which suffice to yield large CARL values.
This fact shows that a single number such as the CARL cannot summarize the statistical
behavior sufficiently. It also highlights the benefit that the random walk null hypothesis can
often be rejected very early.

The simulation results for the control charts using transformed statistics are summarized
in Table 2. Here we used exact control limits obtained by simulation using 20,000 repetitions. Comparing the transformed control statistics with these control limits yields quite
accurate results if $\beta = 0$. The t-type version is preferable for $\beta < 0$.

Comparing the methods $\hat S_T$ (using estimated control limits) and $Z_T$ (using transformed
statistics), our results indicate that the more computer-intensive approach of using estimated
control limits provides more accurate results.

Table 1. Results for the weighted DF control chart with estimated control limits, $\hat S_T$.

Table 2. Results for the transformed weighted DF control charts $Z_T$ and $\tilde Z_T$.

Figure 1. Part of the distribution of $\hat S_T - k + 1$ for $\rho = 0.95$ (circles) and $\rho = 0.9$ (crosses).

Introduction



Analyzing whether a time series is stationary or is a non-stationary random walk (unit root 
process) in the sense that the first order differences form a stationary series is an important 
issue in time series analysis, particularly in econometrics. Often the task is to test the unit
root null hypothesis against the alternative of stationarity at a pre-specified $\alpha$ level, which
ensures that a decision in favor of stationarity is statistically significant. For instance,
the equilibrium analysis of macroeconomic variables as established by Granger (1981) and
Engle and Granger (1987) defines an equilibrium of two random walks as the existence
of a stationary linear combination. When analyzing equilibrium errors of a cointegration
relationship, rejection of the null hypothesis in favor of stationarity means that the decision
to believe in a valid equilibrium is statistically justified at the pre-specified $\alpha$ level. For
an approach where CUSUM based residual tests are employed to test the null hypothesis 
of cointegration, we refer to Xiao and Phillips (2002). Their test uses residuals calculated 
from the full sample. In the present article we study sequential monitoring procedures 
which aim at monitoring a time series until a time horizon T to detect stationarity as soon 
as possible. 

The question whether a time series is stationary or a random walk is also of considerable 
importance to choose a valid method when analyzing the series to detect trends. Such 
procedures usually assume stationarity, see Steland (2004, 2005a), Pawlak et al. (2004), 
Huskova (1999), Huskova and Slaby (2001), Ferger (1993, 1995), among others. As shown in 
Steland (2005b), when using Nadaraya-Watson type smoothers to detect drifts the limiting
distributions for the random walk case differ substantially from the case of a stationary 
time series. 

To detect changes in a process or a misspecified model, a common approach originating
in statistical quality control is to formulate an in-control model (null hypothesis) and an
out-of-control model (alternative), and to apply appropriate control charts or, respectively, stopping
times. Given a time series $Y_1, Y_2, \dots$, a monitoring procedure with time horizon (maximum
sample size) $T$ is given by a stopping time $S_T = \inf\{1 \le t \le T : U_t \in A\}$ using the
convention $\inf \emptyset = \infty$, where $U_t$, called control statistic, is a $\sigma(Y_1, \dots, Y_t)$-measurable $\mathbb{R}$-valued statistic sensitive to the alternatives of interest, and $A \subset \mathbb{R}$ is a measurable set
such that $\{U_t \in A\}$ has small probability under the null model and high probability under
the alternative of interest. In most cases $A$ is of the form $(-\infty, c)$ or $(c, \infty)$ for some given
control limit (critical value) $c$. To design monitoring procedures, the standard approach is
to choose the control limit to ensure that the average run length (ARL), $\mathrm{ARL} = E(S_T)$,
is greater than or equal to some pre-specified value. However, controlling the significance level
is also a serious concern. The results presented in this article can be used to control any
characteristic of interest, although we will focus on the type I error in the sequel.
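The stopping time of such a monitoring procedure is simply the first time the control statistic enters the rejection set. A minimal sketch for the one-sided case $A = (-\infty, c)$ follows; the function name `stopping_time` and the list-based interface are assumptions made for illustration.

```python
import math


def stopping_time(control_stats, c, start=1):
    """First t in {start, ..., T} with U_t < c, or math.inf if there is none.

    control_stats[t - 1] holds U_t, so T = len(control_stats).
    """
    for t in range(start, len(control_stats) + 1):
        if control_stats[t - 1] < c:
            return t
    return math.inf   # convention: the infimum of the empty set is infinity


signal_at = stopping_time([0.5, -0.2, -3.0, -4.0], c=-2.5)
```

Because each $U_t$ depends only on $Y_1, \dots, Y_t$, the loop can be run online, evaluating one new statistic per incoming observation.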

The (weighted) Dickey-Fuller control chart studied in this article is essentially based on a
sequential version of the well-known Dickey-Fuller (DF) unit root test, which is motivated
by least squares. Due to its power properties this test is very popular, although it is known
that its statistical properties strongly depend on a correct specification of the correlation
structure of the innovation sequence. The DF test and its asymptotic properties, particularly its non-standard limit distribution, have been studied by White (1958), Fuller (1976),
Rao (1978, 1980), Dickey and Fuller (1979), Evans and Savin (1981), Chan and Wei
(1987, 1988), and Phillips (1987), among others. We will generalize some of these results. To
ensure quicker detection in case of a change to stationarity, we modify the DF statistic
by introducing kernel weights to attach small weights to summands corresponding to past
observations. We provide the asymptotic theory for the related Dickey-Fuller (DF) type
processes and stopping times, also covering local-to-unity alternatives.

For correlated error terms the asymptotic distribution of the DF test statistic, and hence
the control limit of a monitoring procedure, depends on a nuisance parameter, which can
be estimated by Newey-West type estimators. We consider two approaches to deal with
that problem. Firstly, based on a consistent estimate of the nuisance parameter one may
take the asymptotic control limit corresponding to the estimated value. Secondly, following
Phillips (1987) one may consider appropriate transformations of the processes possessing
limit distributions which no longer depend on the nuisance parameter. A nonparametric
approach called the KPSS test, which avoids this problem, at least for I(1) processes, has been
proposed by Kwiatkowski et al. (1992). That unit root test has better type I error accuracy,
but tends to be less powerful. Monitoring procedures related to this approach and their
merits have been studied in detail in Steland (2006).

The organisation of the paper is as follows. In Section 1 we carefully explain and motivate
our assumptions on the time series model, and present the class of Dickey-Fuller type
processes and related stopping times. The asymptotic distribution theory under the null
hypothesis of a random walk is provided in Section 2. Section 3 studies local-to-unity
asymptotics, where the asymptotic distribution is driven by an Ornstein-Uhlenbeck process
instead of the Brownian motion appearing in the unit root case. Finally, in Section 4 we
compare the methods by simulations.

1. Model, assumptions, and Dickey-Fuller type processes and control charts

1.1. Time series model. Our results work under quite general nonparametric assumptions allowing for dependencies and conditional heteroskedasticity (GARCH effects), thus
providing a nonparametric view on the parametrically motivated approach. To motivate
our assumptions, let us consider the following common time series model, which is often
used in applications. Suppose to this end that $\{Y_t\}$ is an AR(p) time series, i.e.,

$$Y_t = a_1 Y_{t-1} + \dots + a_p Y_{t-p} + u_t,$$

for starting values $Y_{-p}, \dots, Y_{-1}$, where $\{u_t\}$ are i.i.d. error terms (innovations) with $E(u_t) = 0$
and $\sigma_u^2 = \mathrm{Var}(u_t)$, $0 < \sigma_u^2 < \infty$. Assume the characteristic polynomial

$$p(z) = 1 - a_1 z - \dots - a_p z^p, \qquad z \in \mathbb{C},$$

has a unit root, i.e., $p(1) = 0$, of multiplicity 1, and all other roots are outside the unit
circle, i.e., $p(z) = 0$ implies $|z| > 1$. Then $p(z) = p^*(z)(1 - z)$ for some polynomial $p^*(z)$
which has no roots in the unit circle, implying that $1/p^*(z)$ exists for all $|z| \le 1$. We obtain
$p(L) Y_t = p^*(L) \Delta Y_t = u_t$, where $L$ denotes the lag operator. Since $p^*(L)$ can be inverted, we
have the representation

for coefficients $\{\beta_j\}$. This means that $\{Y_t\}$ satisfies an AR(1) model with correlated errors. For
the calculation of $\beta_j$ we refer to Brockwell and Davis (1991, Sec. 3.3). In particular, to
analyze an AR(p) series for a unit root, one can work with an AR(1) model with correlated
errors.

The representation (1) motivates the following time series framework which will be assumed
in the sequel. Suppose we are given a univariate time series $\{Y_t : t = 0, 1, \dots\}$ satisfying

(2) $Y_t = \rho Y_{t-1} + \epsilon_t, \quad t \ge 1, \quad Y_0 = 0,$

where $\rho \in (-1, 1]$ is a fixed but unknown parameter. Concerning the error terms $\{\epsilon_t\}$ we
impose the following assumptions.

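(E1) $\{\epsilon_t\}$ is a strictly stationary series with mean zero and $E|\epsilon_1|^4 < \infty$ with the following
properties: We have

$$\sum_{t=1}^{\infty} \mathrm{Cov}(\epsilon_1^2, \epsilon_{1+t}^2) < \infty,$$

and both $\{\epsilon_t\}$ and $\{\epsilon_t^2\}$ satisfy a functional central limit theorem, i.e.,

(3) $T^{-1/2} \sum_{i \le \lfloor Ts \rfloor} \epsilon_i \Rightarrow \eta B(s),$

and

(4) $T^{-1/2} \sum_{i \le \lfloor Ts \rfloor} (\epsilon_i^2 - E \epsilon_i^2) \Rightarrow \eta' B^{(2)}(s),$

as $T \to \infty$, for constants $0 < \eta, \eta' < \infty$. Here $B$ and $B^{(2)}$ denote (standard)
Brownian motions (Wiener processes) with start in 0.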
(E2) $\{\epsilon_t\}$ is a strong mixing strictly stationary time series with £'|ei|^(-'^+'') < oo for
some S > 0, and with mixing coefficients, a{k), satisfying 

oo 

< oo. 

fe=l 

In assumption (E1) and the rest of the paper $\Rightarrow$ denotes weak convergence in the space
$D[0, 1]$ of all càdlàg functions equipped with the Skorokhod metric $d$.

Remark 1.1. The assumption that $\{\epsilon_t\}$ satisfies an invariance principle can be regarded
as a nonparametric definition of the I(0) property ensuring that the partial sums converge
weakly to a (scaled) Brownian motion $B$. For a parametrically oriented definition see Stock
(1994). In particular, the scale parameter $\eta$ is given by

(5) $\eta^2 = \lim_{T \to \infty} \eta_T^2, \qquad \eta_T^2 = \sigma^2 + 2 \sum_{t=1}^{T} (T - t) T^{-1} E(\epsilon_1 \epsilon_{1+t}).$

Also introduce the notations

(6) $\vartheta_T^2 = \eta_T^2 / \sigma^2, \qquad \vartheta^2 = \lim_{T \to \infty} \vartheta_T^2.$

If the $\epsilon_t$ are uncorrelated, we have $\eta^2 = \sigma^2$, and $\vartheta = 1$.

As a non-trivial example for processes satisfying (E1) let us consider ARCH processes.

Example 1.1. A time series $\{X_t\}$ satisfies ARCH($\infty$) equations, if there exists a sequence
of i.i.d. non-negative random variables, $\{\xi_t\}$, such that

$$X_t = \rho_t \xi_t, \qquad \rho_t = a + \sum_{j=1}^{\infty} b_j X_{t-j},$$

where $a \ge 0$, $b_j \ge 0$, $j = 1, 2, \dots$ This model is often applied to model conditional heteroscedasticity of an uncorrelated sequence $\{\epsilon_t\}$ with $E \epsilon_t = 0$ for all $t$, by putting $X_t = \epsilon_t^2$.
A common choice for ^t is to assume that the are i.i.d. with common standard nor- 
mal distribution. In Giraitis et al. (2003) it has been shown that an unique and strictly 
stationary solution exists and satisfies $\sum_k \mathrm{Cov}(X_1, X_{1+k}) < \infty$, if

In addition, under these conditions the functional central limit theorem holds. The rate
of decay of the coefficients $b_j$ controls the asymptotic behavior of $\mathrm{Cov}(X_1, X_{1+k})$. If for
some $\gamma > 1$ and $c > 0$ we have $b_j \le c j^{-\gamma}$, $j = 1, 2, \dots$, then there exists $C > 0$ such that
$\mathrm{Cov}(X_1, X_{1+k}) \le C k^{-\gamma}$ for $k \ge 1$. Thus, depending on the rate of decay, (E2) may also
hold.

Remark 1.2. Assumption (E2) will be used to verify a tightness criterion. Combined with
appropriate moment conditions it implies the invariance principles (3) and (4).

1.2. Dickey-Fuller processes. We will now introduce the class of Dickey-Fuller processes
and related detection procedures. Recall that the least squares estimator of the parameter
$\rho$ in model (2) is given by

$$\hat\rho_T = \sum_{t=1}^{T} Y_t Y_{t-1} \Big/ \sum_{t=1}^{T} Y_{t-1}^2.$$

To test the null hypothesis $H_0 : \rho = 1$, one forms the Dickey-Fuller (DF) test statistic

$$D_T = T(\hat\rho_T - 1) = \frac{T^{-1} \sum_{t=1}^{T} Y_{t-1} (Y_t - Y_{t-1})}{T^{-2} \sum_{t=1}^{T} Y_{t-1}^2}.$$
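For concreteness, the least squares estimator and the DF statistic can be computed as in the following minimal sketch. The function name `df_statistic` is illustrative, and the code assumes that at least one lagged observation is non-zero.

```python
def df_statistic(y):
    """Return (rho_hat, D_T) for observations y = [Y_0, Y_1, ..., Y_T].

    rho_hat is the least squares estimator of the AR(1) parameter and
    D_T = T * (rho_hat - 1) is the Dickey-Fuller statistic.
    """
    T = len(y) - 1
    numerator = sum(y[t] * y[t - 1] for t in range(1, T + 1))
    denominator = sum(y[t - 1] ** 2 for t in range(1, T + 1))
    rho_hat = numerator / denominator
    return rho_hat, T * (rho_hat - 1.0)


rho_hat, d_stat = df_statistic([0.0, 1.0, 2.0, 3.0])
```

Small (strongly negative) values of the statistic speak against the unit root hypothesis and in favour of stationarity.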

Suppose at this point that the are uncorrelated. Provided \p\ < 1, < J2t=i ^t-i \ (Pt ~ 
1) -4 A/'(0, 1), as T — i- oo. However, />r has a different convergence rate and a non-normal 
limit distribution, if p = 1. It is known that 

$$D_T \Rightarrow \mathcal{D}_1 = \frac{(1/2)(B(1)^2 - 1)}{\int_0^1 B(r)^2 \, dr},$$

as $T \to \infty$, see White (1958), Fuller (1976), Rao (1978, 1980), Dickey and Fuller (1979),
and Evans and Savin (1981). Recall that $B$ denotes standard Brownian motion. Based on
that result one can construct a statistical level $\alpha$ test, which rejects the null hypothesis
$H_0 : \rho = 1$ of a unit root against the alternative $H_1 : \rho < 1$ if $D_T < c$, where the
critical value $c$ is the $\alpha$-quantile of the distribution of $\mathcal{D}_1$. More generally, we want to
construct a detection rule which provides a signal if there is some change-point $q$ such that
$Y_1, \dots, Y_{q-1}$ form a random walk (unit root process), and $Y_q, \dots, Y_T$ form an AR(1) with
dependent innovations. This means that the alternative hypothesis is $H_1 = \bigcup_{1 \le q \le T} H_1^{(q)}$, where
$H_1^{(q)}$, $1 \le q \le T$, specifies that

$$Y_t = \begin{cases} Y_{t-1} + \epsilon_t, & 1 \le t < q, \\ \rho Y_{t-1} + \epsilon_t, & q \le t \le T, \end{cases}$$

where $\rho \in (-1, 1)$. However, for the calculation of the detection rule to be introduced now,
knowledge of a specific alternative hypothesis is not required.

A naive approach to monitor a time series to check for deviations from the unit root hy- 
pothesis is to apply the DF statistic at each time point using the most recent observations. 
A more sophisticated version of this idea is to modify the DF statistic to ensure that 
summands in the numerator have small weight if their time distance to the current time 
point is large. To define such a detection rule, let us introduce the following sequential 
kernel-weighted Dickey-Fuller (DF) process 

where $\Delta Y_t = Y_t - Y_{t-1}$. Here and in the following we put $0/0 = 0$ for convenience. Note
that $\lfloor Ts \rfloor$ plays the role of the current time point. The non-negative smoothing kernel $K$
is used to attach smaller weights to summands from the distant past to avoid that such
summands dominate the sum. Thus, kernels ensuring that $z \mapsto K(|z|)$, $z \in \mathbb{R}$, is decreasing
are appropriate, but that property is not required. We do not use kernel weights in the
denominator, since it is used to estimate a nuisance parameter. We will require the following
regularity conditions for $K : \mathbb{R} \to \mathbb{R}_0^+$.

(K1) $\|K\|_\infty < \infty$, $\int K(z) \, dz = 1$ and $\int z K(z) \, dz = 0$.
(K2) K is with bounded derivative. 
(K3) K has bounded variation. 

Note that it is not required to use a kernel with compact support. 
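To make the construction concrete, here is a hedged sketch of the kernel-weighted DF process evaluated at a time point $n = \lfloor Ts \rfloor$. It assumes the form suggested by the decomposition used in the proof of Theorem 2.1, namely numerator $n^{-1} \sum_{t=1}^{n} Y_{t-1} \Delta Y_t K((n - t)/h)$ and unweighted denominator $n^{-2} \sum_{t=1}^{n} Y_{t-1}^2$, together with a Gaussian kernel; the function names are illustrative.

```python
import math


def gaussian_kernel(z):
    """Standard normal density, one kernel satisfying (K1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def weighted_df(y, n, h, kernel=gaussian_kernel):
    """Kernel-weighted DF statistic at time n for y = [Y_0, ..., Y_T].

    Assumed form (see lead-in): weighted numerator, unweighted denominator,
    with the convention 0/0 = 0.
    """
    num = sum(y[t - 1] * (y[t] - y[t - 1]) * kernel((n - t) / h)
              for t in range(1, n + 1)) / n
    den = sum(y[t - 1] ** 2 for t in range(1, n + 1)) / (n * n)
    if den == 0.0:
        return 0.0   # all lagged values vanish, so the numerator is 0 as well
    return num / den


value = weighted_df([0.0, 1.0, 2.0, 3.0], n=3, h=1.0)
```

With a constant kernel equal to one, the function reduces to the classical DF statistic based on the first $n$ observations, which is a convenient sanity check.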



The parameter $h = h_T$ is used as a scaling constant in the kernel and defines the memory
of the procedure. For instance, if $K(z) > 0$ if $z \in [-1, 1]$ and $K(z) = 0$ otherwise, the
process $D_T$ looks back $h$ observations. We will assume that

(8) $T / h_T \to \zeta, \qquad T \to \infty,$

for some $1 \le \zeta < \infty$. That condition ensures that the number of observations used by $D_T$
gets larger as $T$ increases. Note that the parameter $\zeta$, which will also appear in the limit
distributions, could be absorbed into the kernel $K$. However, in practice one usually fixes
a kernel $K$ and chooses a bandwidth $h$ relative to the time horizon $T$. (8) is therefore not
restrictive.

1.3. Dickey-Fuller type control charts. Since small values of $D_T(s)$ provide evidence
for the alternative that the time series is stationary, intuition suggests that the control
chart should give a signal if $D_T$ is smaller than a specified control limit $c$. Hence, we define

$$S_T = S_T(c) = \inf\{k \le t \le T : D_T(t/T) < c\}, \qquad \inf \emptyset = \infty.$$

We will assume that the start of monitoring, $k$, is given by

$$k = \lfloor T \kappa \rfloor, \qquad \text{for some } \kappa \in (0, 1).$$

A reasonable approach to choose $c$ is to control the type I error rate $\alpha \in (0, 1)$, i.e., to
ensure that

(9) $\lim_{T \to \infty} P_0(S_T(c) \le T) = \alpha,$

where $P_0$ indicates that the probability is calculated assuming that $\{Y_t\}$ is a random walk
corresponding to the null hypothesis $H_0 : \rho = 1$.
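One simple way to approximate such a control limit in finite samples is Monte Carlo calibration under the null model: the chart signals by time $T$ exactly when the minimum of the control statistic over the monitoring period falls below $c$, so $c$ can be taken as an empirical $\alpha$-quantile of simulated minima. The sketch below is an illustration under assumptions, not the asymptotic procedure of the text: it uses i.i.d. standard normal increments, a Gaussian kernel, an assumed form of the weighted DF statistic (weighted numerator, unweighted denominator), and hypothetical function names.

```python
import math
import random


def weighted_df(y, n, h):
    """Assumed weighted DF statistic at time n with a Gaussian kernel (0/0 = 0)."""
    num = sum(y[t - 1] * (y[t] - y[t - 1])
              * math.exp(-0.5 * ((n - t) / h) ** 2) / math.sqrt(2.0 * math.pi)
              for t in range(1, n + 1)) / n
    den = sum(y[t - 1] ** 2 for t in range(1, n + 1)) / (n * n)
    return num / den if den != 0.0 else 0.0


def calibrate_control_limit(T, kappa, h, alpha, n_rep, seed=0):
    """Return (c, rate): control limit and in-sample rejection rate under H0."""
    rng = random.Random(seed)
    k = max(1, math.floor(T * kappa))        # start of monitoring
    minima = []
    for _ in range(n_rep):
        y = [0.0]
        for _ in range(T):                   # random walk with N(0, 1) increments
            y.append(y[-1] + rng.gauss(0.0, 1.0))
        minima.append(min(weighted_df(y, t, h) for t in range(k, T + 1)))
    minima.sort()
    c = minima[math.floor(alpha * n_rep)]    # one convention for the empirical quantile
    rate = sum(m < c for m in minima) / n_rep
    return c, rate


c, rate = calibrate_control_limit(T=60, kappa=0.2, h=15.0, alpha=0.05, n_rep=200)
```

For correlated errors this calibration would have to account for the nuisance parameter, which is the topic of the following subsections.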

1.4. DF control chart with estimated control limit. In the next section we will
show that $D_T$ converges weakly to some stochastic process $\mathcal{D}_\vartheta$ depending on the nuisance
parameter

$$\vartheta = \lim_{T \to \infty} \vartheta_T = \eta / \sigma,$$

and that $S_T / T$ converges in distribution to $\inf\{s \in [\kappa, 1] : \mathcal{D}_\vartheta(s) < c\}$. Hence, if $c$ is chosen
from the asymptotic distribution via (9), $c = c(\vartheta)$ is a function of $\vartheta$. Therefore, the basic
idea is to estimate $\vartheta$ at each time point using only past and current data, and to use the
corresponding limit.

Our estimator for $\vartheta$ will be based on a Newey-West type estimator, thus circumventing
the problem of specifying the short memory dynamics of the process explicitly. Let $\gamma(k) = E(\epsilon_t \epsilon_{t+k})$ and denote by $r(k) = \gamma(k) / E(\epsilon_t^2)$, $k \in \mathbb{N}$, the autocorrelation function of the
time series $\{\epsilon_t\}$. Since $\epsilon_t = \Delta Y_t$ if $\rho = 1$, we can estimate $\gamma(k)$ and $r(k)$ under the null
hypothesis by

(10) $\hat r_t(k) = \hat\gamma_t(k) / \hat\sigma_t^2, \qquad \hat\gamma_t(k) = t^{-1} \sum_{s=k}^{t} \Delta Y_s \Delta Y_{s-k}, \qquad \hat\sigma_t^2 = t^{-1} \sum_{s=1}^{t} \Delta Y_s^2.$

The parameter $\vartheta^2$ can now be estimated by the Newey-West estimator given by

(11) $\hat\vartheta_t^2 = \hat\eta_t^2 / \hat\sigma_t^2, \qquad \hat\eta_t^2 = \hat\sigma_t^2 + 2 \sum_{i=1}^{m} w(m, i) \hat\gamma_t(i),$

where $w(m, i) = (m - i)/m$ are the Bartlett weights and $m$ is a lag truncation parameter,
see Newey and West (1987). Andrews (1991) studies more general weighting functions and
shows that the rate $m = o(T^{1/2})$ is sufficient for consistency.
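The estimator can be sketched as follows. The code uses the Bartlett weights $(m - i)/m$ as stated above and, as an assumption of this illustration, starts the autocovariance sum at the first index for which both differences are available. The helper `lag_truncation` encodes the rule $\lfloor 4 (t/100)^{1/4} \rfloor$ used in the simulation section, where the exponent $1/4$ is itself an assumed reading; all function names are hypothetical.

```python
import math


def lag_truncation(t):
    """Lag truncation rule floor(4 * (t/100)^(1/4)); the exponent is an assumed reading."""
    return math.floor(4.0 * (t / 100.0) ** 0.25)


def newey_west_theta2(dy, m):
    """Return (sigma2_hat, eta2_hat, theta2_hat) from differences dy = [dY_1, ..., dY_t]."""
    t = len(dy)
    sigma2 = sum(d * d for d in dy) / t
    eta2 = sigma2
    for i in range(1, m + 1):
        # dy is 0-based: dy[s] * dy[s - i] pairs the differences that lie i steps apart.
        gamma_i = sum(dy[s] * dy[s - i] for s in range(i, t)) / t
        eta2 += 2.0 * ((m - i) / m) * gamma_i   # Bartlett weight (m - i)/m
    return sigma2, eta2, eta2 / sigma2


sigma2, eta2, theta2 = newey_west_theta2([1.0, -1.0, 1.0, -1.0], m=2)
```

In the example the differences are perfectly negatively autocorrelated at lag one, so the estimated long-run variance is smaller than the marginal variance and the estimate of $\vartheta^2$ falls below one.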

The Dickey-Fuller control chart for correlated time series now works as follows. At each
time point $t$ we estimate $\vartheta$ by $\hat\vartheta_t$ and calculate the corresponding estimated control limit
$c(\hat\vartheta_t)$. A signal is given if $D_T$ is less than the estimated control limit, i.e., we use the rule

$$\hat S_T = \inf\{k \le t \le T : D_T(t/T) < c(\hat\vartheta_t)\}.$$

1.5. DF control chart based on a transformation. Alternatively, one may use a trans- 
formation of Dt, namely 



2[Tsi 



(12) ETis) = DTis)+ _ ,_2^iT.i..o ' SG(0,1]. 



It seems that this transformation idea dates back to Phillips (1987). We will show that for
arbitrary $\vartheta$ the process $E_T$ converges weakly to the limit of $D_T$ for $\vartheta = 1$. Consequently,
if $c$ denotes the control limit ensuring that $S_T$ has size $\alpha$ when $\vartheta = 1$, then the detection
rule

$$Z_T = \inf\{k \le t \le T : E_T(t/T) < c\}$$

has asymptotic size $\alpha$ for any $\vartheta$.

In the next section we shall show that both procedures are asymptotically valid.

1.6. Extensions to Dickey-Fuller processes. Inference on the AR parameter in the
unit root case is often based on the t-statistic associated with $D_T$, which gives rise to Dickey-Fuller t-processes. The Dickey-Fuller t-statistic, $t_{DF}$, associated with $D_T = T(\hat\rho_T - 1)$, is
the standard computer output quantity when running a regression of $Y_t$ on $Y_{t-1}$. For a
sample $Y_1, \dots, Y_T$, the statistic $t_{DF}$ is defined as

toF = {PT - !)/& = T{pT - l)/(r&) 

where 

{T >, 1/2 

with 4 = (T- 1)-' Ef-iff - PTY,.,y-. 

The formula for toF motivates to scale Dt analogously. Hence, let us define the weighted 
t-type DF process by 

(13) Dt{s) = DT{s)/{[Ts\iiTsi), s e (0, 1], 

and $\tilde D_T(0) = 0$. $\tilde D_T(s)$ is a weighted version of $t_{DF}$ calculated using the observations
$Y_1, \dots, Y_{\lfloor Ts \rfloor}$, and attaching kernel weights $K((\lfloor Ts \rfloor - t)/h)$ to the $t$th summand in the
numerator. The associated detection rule for known $\vartheta$ is defined as

$$\tilde S_T = \tilde S_T(c) = \inf\{k \le t \le T : \tilde D_T(t/T) < c(\vartheta)\}$$

with $c(\vartheta)$ such that $\lim_{T \to \infty} P_0(\tilde S_T(c(\vartheta)) \le T) = \alpha$.

Again, it turns out that the asymptotic limit of $\tilde D_T$ depends on the nuisance parameter $\vartheta$.
The weighted t-type DF control chart with estimated control limits is defined as

$$\hat{\tilde S}_T = \inf\{k \le t \le T : \tilde D_T(t/T) < c(\hat\vartheta_t)\}.$$

Alternatively, one can transform the process to achieve that the asymptotic limit is invariant with respect to $\vartheta$. We define

(14) Et{s) = ^^Dt{s) '-^ , = , s G (0, 1]. 

We will show that the detection rule

$$\tilde Z_T = \inf\{k \le t \le T : \tilde E_T(t/T) < c(1)\}$$

has asymptotic type I error equal to $\alpha$ for all $\vartheta$.



2. Asymptotic results for random walks 



In this section we provide functional central limit theorems for the Dickey-Fuller processes
defined in the previous section under a random walk model assumption corresponding to
the null hypothesis $H_0 : \rho = 1$ in model (2), and the related central limit theorem for
the associated stopping rules. These results can be used to design tests and detection
procedures having well-defined statistical properties under the null hypothesis.

2.1. Weighted Dickey-Fuller processes. We start with the following functional central
limit theorem providing the limit distribution of the weighted DF process $D_T(s)$, $s \in [0, 1]$,
which extends Phillips (1987, Th. 3.1 c).

Theorem 2.1. Assume the time series $\{Y_t\}$ satisfies model (2) with $\rho = 1$ such that (E1)
and (K1)-(K3) hold. Then

$$D_T \Rightarrow \mathcal{D}_\vartheta,$$

as $T \to \infty$, where the stochastic process

(15) $\mathcal{D}_\vartheta(s) = \dfrac{K(0) B(s)^2 + \zeta \int_0^s B(r)^2 K'(\zeta(s - r)) \, dr - \vartheta^{-2} \int_0^s K(\zeta(s - r)) \, dr}{(2/s) \int_0^s B^2(r) \, dr},$

$s \in (0, 1]$, $\mathcal{D}_\vartheta(0) = 0$, is continuous w.p. 1.



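Remark 2.1. Note that the asymptotic limit is distribution-free if and only if $\eta = \sigma$,
which holds if the error terms are uncorrelated. Otherwise, the distribution of $\mathcal{D}_\vartheta$ depends
sensitively on $\vartheta$.

Proof. If $\rho = 1$ we have $\epsilon_t = \Delta Y_t$ and $Y_{t-1} \epsilon_t = (1/2)(Y_t^2 - Y_{t-1}^2 - \epsilon_t^2)$ for all $t$. This yields
the representation

$$D_T(s) = \frac{V_T(s) - R_T(s)}{W_T(s)},$$

where the $D[0, 1]$-valued stochastic processes $V_T$, $R_T$, and $W_T$ are given by

$$V_T(s) = (2 \lfloor Ts \rfloor)^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} (Y_t^2 - Y_{t-1}^2) K((\lfloor Ts \rfloor - t)/h),$$

$$R_T(s) = (2 \lfloor Ts \rfloor)^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} \epsilon_t^2 K((\lfloor Ts \rfloor - t)/h),$$

$$W_T(s) = \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2,$$

for $s \in (0, 1]$. Let us first show that

(16) $\sup_{s \in [\kappa, 1]} |R_T(s) - \mu(s)| \stackrel{P}{\to} 0,$

as $T \to \infty$, where

$$\mu(s) = \frac{\sigma^2}{2s} \int_0^s K(\zeta(s - r)) \, dr, \qquad s \in (0, 1].$$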



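Consider

$$|E(R_T(s)) - \mu(s)| = \frac{\sigma^2}{2} \left| \lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} K((\lfloor Ts \rfloor - t)/h) - s^{-1} \int_0^s K(\zeta(s - r)) \, dr \right|.$$

(8) ensures that $\sup_{s \in [\kappa, 1]} \max_i |(\lfloor Ts \rfloor - i)/h - \zeta(s - i/T)| = o(1)$, yielding

$$\lfloor Ts \rfloor^{-1} \sum_{t=1}^{\lfloor Ts \rfloor} K((\lfloor Ts \rfloor - t)/h) = s^{-1} \int_0^s K(\zeta(s - r)) \, dr + o(1),$$

uniformly in $s \in [\kappa, 1]$, because $K$ is Lipschitz continuous and of bounded variation, cf. Theorem 3.3(ii) of Steland (2004). It remains to estimate $|R_T(s) - E(R_T(s))|$. The assumptions
on $\{\epsilon_t\}$ ensure that

$$Z_T(r) = T^{-1/2} \sum_{i=1}^{\lfloor Tr \rfloor} (\epsilon_i^2 - E \epsilon_i^2) \Rightarrow \rho B^{(2)}(r),$$

as $T \to \infty$, where $\rho^2 = \mathrm{Var}(\epsilon_1^2) + 2 \sum_j \mathrm{Cov}(\epsilon_1^2, \epsilon_{1+j}^2)$. Hence, eventually for equivalent
versions, we may assume that $\|Z_T - \rho B^{(2)}\|_\infty \to 0$ a.s., for $T \to \infty$. By (K3) the Stieltjes
integrals $\int_0^s K(\zeta(s - r)) \, dB^{(2)}(r)$ and $\int_0^s K(\zeta(s - r)) \, dZ_T(r)$ are well defined (via integration
by parts), and

$$\sup_{s \in [\kappa, 1]} \left| \int_0^s K(\zeta(s - r)) \, dZ_T(r) - \rho \int_0^s K(\zeta(s - r)) \, dB^{(2)}(r) \right| \stackrel{a.s.}{\to} 0,$$

as $T \to \infty$. Obviously,

$$\sup_{s \in [\kappa, 1]} |R_T(s) - E R_T(s)| = \sup_{s \in [\kappa, 1]} \frac{\sqrt{T}}{2 \lfloor Ts \rfloor} \left| \int_0^s K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h) \, dZ_T(r) \right|$$

$$\le \sup_{s \in [\kappa, 1]} \frac{\rho \sqrt{T}}{2 \lfloor Ts \rfloor} \left| B^{(2)}(r) K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h) \Big|_{r=0}^{r=s} - \int_0^s B^{(2)}(r) K((\lfloor Ts \rfloor - \lfloor T(dr) \rfloor)/h) \right|$$

$$+ \sup_{s \in [\kappa, 1]} \frac{\sqrt{T}}{2 \lfloor Ts \rfloor} \left| [Z_T(r) - \rho B^{(2)}(r)] K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h) \Big|_{r=0}^{r=s} - \int_0^s [Z_T(r) - \rho B^{(2)}(r)] K((\lfloor Ts \rfloor - \lfloor T(dr) \rfloor)/h) \right|.$$

Noting that the total variation of the functions $r \mapsto K([\lfloor Ts \rfloor - \lfloor Tr \rfloor]/h)$, $s \in [\kappa, 1]$, $T \ge 1$,
is uniformly bounded, the right side of the above display can be estimated by

$$O\left(T^{-1/2} \|B^{(2)}\|_\infty \|K\|_\infty\right) + O\left(T^{-1/2} \|B^{(2)}\|_\infty \int |dK|\right) + O\left(T^{-1/2} \|Z_T - \rho B^{(2)}\|_\infty \|K\|_\infty\right) + O\left(T^{-1/2} \|Z_T - \rho B^{(2)}\|_\infty \int |dK|\right) = O_P(1/\sqrt{T}) = o_P(1).$$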

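Therefore, (16) holds true. Let us now consider $V_T$. We will first show that, up to terms of
order $o_P(1)$, $V_T$ is a functional of

$$U_T(r) = T^{-1/2} Y_{\lfloor Tr \rfloor}, \qquad r \in [0, 1].$$

Again, under the assumptions of the theorem, $U_T$ converges weakly to $\eta B$, where $B$ denotes
Brownian motion and $\eta > 0$ is a constant. For brevity of notation let

$$k_T(r; s) = K((\lfloor Ts \rfloor - \lfloor Tr \rfloor)/h), \qquad r, s \in [0, 1].$$

Integration by parts yields

$$V_T(s) = \frac{1}{2 \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} (Y_t^2 - Y_{t-1}^2) K((\lfloor Ts \rfloor - t)/h)$$

$$= \frac{T}{2 \lfloor Ts \rfloor} \left[ k_T(r; s) U_T^2(r) \Big|_{r=0}^{r=s} - \int_0^s U_T^2(r) \, k_T(dr; s) \right]$$

$$= \frac{1}{2s} K(0) U_T^2(s) + \frac{\zeta}{2s} \int_0^s U_T^2(r) K'(\zeta(s - r)) \, dr + o_P(1)$$

$$= \frac{\eta^2 K(0) B^2(s)}{2s} + \frac{\zeta}{2s} \int_0^s U_T^2(r) K'(\zeta(s - r)) \, dr + o_P(1).$$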

Due to (K2) the $o_P(1)$ term is uniform in $s \in [\kappa, 1]$. Next note that

We are now in a position to verify joint weak convergence of numerator and denominator
of $D_T$. The Lipschitz continuity of $K$ ensures that up to terms of order $o_P(1)$ for all
$(\lambda_1, \lambda_2) \in \mathbb{R}^2$ the linear combination $\lambda_1 (V_T(s) - R_T(s)) + \lambda_2 W_T(s)$ is a functional of $U_T$,
and that functional is continuous. Therefore, the continuous mapping theorem (CMT)
entails weak convergence to the stochastic process

$$\lambda_1 \left[ \frac{\eta^2}{2s} \left\{ K(0) B(s)^2 + \zeta \int_0^s K'(\zeta(s - r)) B^2(r) \, dr \right\} - \frac{\sigma^2}{2s} \int_0^s K(\zeta(s - r)) \, dr \right] + \lambda_2 \frac{\eta^2}{s^2} \int_0^s B(r)^2 \, dr.$$

This verifies joint weak convergence of $(V_T - R_T, W_T)$. Hence, the result follows by the
CMT. (K2) also ensures that $\mathcal{D}_\vartheta \in C[0, 1]$ w.p. 1. □

The central limit theorem (CLT) for the detection procedure $S_T$, which requires knowledge
of $\vartheta$, appears as a corollary.

Corollary 2.1. Under the assumptions of Theorem 2.1 we have for any control limit $c < 0$

$$S_T / T \stackrel{d}{\to} \inf\{s \in [\kappa, 1] : \mathcal{D}_\vartheta(s) < c\},$$

as $T \to \infty$, where $\mathcal{D}_\vartheta(s)$ is defined in (15).

Proof. Observe that by definition of $S_T$

$$S_T / T > x \iff \inf_{s \in [\kappa, x]} D_T(s) \ge c \iff \sup_{s \in [\kappa, x]} -D_T(s) \le -c$$

for any $x \in \mathbb{R}$. Hence it suffices to show that

$$P\Big( \sup_{s \in [\kappa, x]} -D_T(s) \le -c \Big) \to P\Big( \sup_{s \in [\kappa, x]} -\mathcal{D}_\vartheta(s) \le -c \Big),$$

where $\mathcal{D}_\vartheta$ denotes the limit process given in Theorem 2.1. Using the Skorokhod/Dudley/Wichura representation theorem and a result due to Lifshits (1982), this fact can be shown
along the lines of the proof of Theorem 4.1 in Steland (2004), if $c < 0$, since $\mathcal{D}_\vartheta \in C[0, 1]$
a.s. For brevity we omit the details. □

Let us now show consistency of the detection procedure $\hat S_T = \inf\{k \le t \le T : D_T(t/T) < c(\hat\vartheta_t)\}$,
which uses estimated control limits.

Theorem 2.2. Assume (E1) and (E2), (K1)-(K3), and in addition that the lag truncation
parameter, $m$, of the Newey-West estimator satisfies

m — I — >■ oo. 

Then the weighted Dickey-Fuller type control chart with estimated control limit, $\hat S_T$, is
consistent, i.e.,

$$P(\hat S_T \le T) \to \alpha,$$

as $T \to \infty$.

Proof. Note that the equivalence St > T ^ mis^[K,i] -Dr(s)/c('t?LTsj) > 1 imphes 

(17) P{St <T) = p( inf -^M- < 1 

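Let us first show that the function $c$ is continuous. Note that the process $\mathcal{D}_\vartheta(s)$ can be
written as $\mathcal{E}(s) - \vartheta^{-2} \mathcal{J}(s)$ for a.s. continuous processes $\mathcal{E}$ and $\mathcal{J}$ not depending on $\vartheta$, where
particularly $\mathcal{J}(0) = 0$ and

$$\mathcal{J}(s) = (s/2) \int_0^s K(\zeta(s-r))\, dr \Big/ \int_0^s B^2(r)\, dr, \qquad s \in (0,1].$$

Let $\{\vartheta^*, \vartheta_n : n \ge 1\} \subset \mathbb{R}$ be a sequence with $\vartheta_n \to \vartheta^*$ as $n \to \infty$. Clearly, for each $\omega$ of
the underlying probability space with $\|\mathcal{E}(\omega)\|_\infty, \|\mathcal{J}(\omega)\|_\infty < \infty$, we have

$$\sup_{s \in [\kappa,1]} |\mathcal{D}_{\vartheta_n}(s) - \mathcal{D}_{\vartheta^*}(s)| \to 0, \qquad n \to \infty.$$

Hence, $\sup_{s \in [\kappa,1]} \mathcal{D}_{\vartheta_n}(s) \xrightarrow{d} \sup_{s \in [\kappa,1]} \mathcal{D}_{\vartheta^*}(s)$ as $n \to \infty$. Since $\sup_{s \in [\kappa,1]} \mathcal{D}_{\vartheta^*}(s)$
has a continuous density, this is equivalent to pointwise convergence of the d.f. $F_n(z) = P(\sup_{s \in [\kappa,1]} \mathcal{D}_{\vartheta_n}(s) \le z)$
to $F(z) = P(\sup_{s \in [\kappa,1]} \mathcal{D}_{\vartheta^*}(s) \le z)$, as $n \to \infty$, for all $z \in \mathbb{R}$.
Hence,

$$c(\vartheta_n) = F_n^{-1}(\alpha) \to F^{-1}(\alpha) = c(\vartheta^*),$$

as $n \to \infty$. Next we show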

(18) $\hat\vartheta_{\lfloor Ts \rfloor} \Rightarrow \vartheta,$

as $T \to \infty$, in $D[\kappa, 1]$. Since for each $s \in [\kappa, 1]$ we have $\hat\vartheta_{\lfloor Ts \rfloor} \xrightarrow{P} \vartheta$, for $T \to \infty$, fidi
convergence follows immediately. It remains to verify tightness. Recall the definitions f lTU])
and (ITTi) and that $\Delta Y_j = e_j$ under $H_0$. Fix $j$ and consider the process $\hat\gamma_{\lfloor Ts \rfloor}(j)$, $s \in [\kappa, 1]$,
which is a functional of $\{e_t e_{t-j} : t = j, j+1, \dots\}$. Clearly, by the Cauchy-Schwarz inequality
and (El) E\etet-j\'^'^^ < E\et\'^'^'^^ < oo for some 6 > 0. Further, since J^i^o = (^{^s^s-j '■ s < 
t) C = cr(e, : s <t) and J'^,^ = a{eses-j : s > t + k) C J^^^, = a{es : s>t + k-j), 
the mixing coefficients 5(/c) of {etet-j} satisfy 

a{k) = sup _ sup ^ \P{AnB) - P{A)P{B)\ 

< sup sup \P{AnB) - P{A)P{B)\ = a{k- j), 

where $\{\alpha(k)\}$ are the mixing coefficients of $\{e_t\}$. Due to (E1) we can apply Yokoyama
(1980, Th. 1) with $r = 2 + 2\delta$ to conclude that for $\kappa \le r \le s \le 1$



E 



lTs\ 
t=[Tr i+1 



2+25 

1+S\ 



0{\s-r\ 



Now the decomposition 

E 

t=[TrJ+l 

and the triangle inequality yield 



T 1 ^^'^ f T T \ 1 ^^"^ 



l^illTsiU) - 7LT.j(j))ll2+2. = 0(s-i|s - rr+^)/(2+2^)) + 0(|1/. - l/r|r(i+^)/(2+2^)) 



0(|s-r 



(l+5)/(2+25)^ 



since, firstly, we may assume < 5 < 1, and, secondly, both s ^ and ^ <5)/(2+25) 
bounded away from and oo for < r < s < 1. Consequently, 
and therefore van der Vaart and Wellner (1996, Ex. 2.2.3) implies tightness of the process $\{\sqrt{T}\, \hat\gamma_{\lfloor Ts \rfloor}(j) : s \in [\kappa, 1]\}$
for fixed $j \ge 0$. Note that $\hat\gamma_{\lfloor Ts \rfloor}(0) = \hat\sigma^2_{\lfloor Ts \rfloor}$. By the triangle inequality we have

m 

l|v^(%TsJ -%Trj)||2+25 < 2^{l - j / mf^^^\a\Ts\{3) - lVTr\{j)h+2& 

= 0(m|s-r|(i+^)/(2+^)), 

yielding 

E\viTsi - VlTri = Oiim/T'/'r''\s - rr+^). 

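Hence, $\{(\hat\eta^2_{\lfloor Ts \rfloor}, \hat\sigma^2_{\lfloor Ts \rfloor}) : s \in [\kappa, 1]\}$ is tight in the product space, which implies weak
convergence of $\{\hat\vartheta_{\lfloor Ts \rfloor} : s \in [\kappa, 1]\}$ to $\vartheta$. The final step is to verify

(19) $\inf_{s \in [\kappa,1]} D_T(s)/c(\hat\vartheta_{\lfloor Ts \rfloor}) \xrightarrow{d} \inf_{s \in [\kappa,1]} \mathcal{D}_\vartheta(s)/c(\vartheta),$

as $T \to \infty$, since this implies that (17) converges to $P(\inf_{s \in [\kappa,1]} \mathcal{D}_\vartheta(s) < c(\vartheta)) = \alpha$, as
$T \to \infty$. Due to (18) we can conclude that

$$(D_T(\cdot), \hat\vartheta_{\lfloor T \cdot \rfloor}) \Rightarrow (\mathcal{D}_\vartheta(\cdot), \vartheta)$$

in the product space $(D[\kappa, 1])^2$. Note that the mapping $\varphi : (D[\kappa, 1], d)^2 \to \mathbb{R}$ given by

$$\varphi(x, y) = \inf_{s \in [\kappa,1]} \frac{x(s)}{c(y(s))}, \qquad x, y \in D[\kappa, 1],$$

is continuous in all $(x, y) \in (C[\kappa, 1])^2$. Since $\mathcal{D}_\vartheta \in C[0,1]$ w.p. 1 and $c \in C(\mathbb{R})$, (19)
follows. □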

It remains to provide the related weak convergence results for the transformed process $E_T$
and its natural detection rule $Z_T = \inf\{k \le t \le T : E_T(t/T) < c\}$.

Theorem 2.3. Assume (E1), (E2), and (K1)-(K3). Additionally assume that the lag truncation
parameter, $m$, of the Newey-West estimator satisfies

Then,

$$E_T(s) \Rightarrow \mathcal{D}_1(s), \qquad \text{in } (D[\kappa, 1], d),$$

as $T \to \infty$, and for the transformed Dickey-Fuller type control chart we have

$$Z_T / T \xrightarrow{d} \inf\{\kappa \le t \le 1 : \mathcal{D}_1(t) < c\},$$

as $T \to \infty$. Particularly, the asymptotic distributions are invariant with respect to $\vartheta$.

Proof. As shown above, 

-2 ^ „2 „„ 1 -2 ^ _2 



as T ^ OO, which imphes that 

\Ts\ \ 



t=i 

if T ^ oo, yielding 



LT^J-ES^^i. 

^ . ^ , <j'-v's-^J^K{C{s-r))dr 
+ s-^J^B^ir)dr 



□ 



2.2. Weighted Dickey-Fuller t-processes. Let us now derive (functional) central limit
theorems for the weighted Dickey-Fuller t-processes and the associated detection rules. We
start with the process $\tilde D_T$ under the random walk null hypothesis.

Theorem 2.4. Assume (E1), and (K1)-(K3). Then

$$\tilde D_T \Rightarrow \tilde{\mathcal{D}}_\vartheta, \qquad \text{in } (D[\kappa, 1], d),$$

as $T \to \infty$, where

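$$
\tilde{\mathcal{D}}_\vartheta(s) = \frac{\vartheta}{2} \cdot \frac{K(0) B(s)^2 + \zeta \int_0^s B(r)^2 K'(\zeta(s-r))\, dr - \vartheta^{-2} \int_0^s K(\zeta(s-r))\, dr}{\big( \int_0^s B(r)^2\, dr \big)^{1/2}}
$$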

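for $s \in (0, 1]$ and $\tilde{\mathcal{D}}_\vartheta(0) = 0$. Here $\vartheta = \eta/\sigma$. $\tilde{\mathcal{D}}_\vartheta$ is continuous a.s.

Remark 2.2. Note that again the limit depends on the nuisance parameter $\vartheta$ and is
distribution-free if and only if $\vartheta = 1$.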
Proof. By definition



where 



Dt{s) 



Dt{s) 

[Ts\'ilTs\ 



q2 



with 



[Ts] 



C2 _ 



J t=i 

for $s \in (0, 1]$. Note that for $t = 1, \dots, \lfloor Ts \rfloor$

$$\hat e_t(\lfloor Ts \rfloor) - e_t = -(\hat\rho_{\lfloor Ts \rfloor} - 1)\, Y_{t-1}.$$

Hence, we obtain



r.2 



[Ts\ 



From the proof of Theorem 2.1 we know that



[Ts\ 

sup [Ts\ V = sup 
se(o,i] ^*G(o,i] 



T 



f\T-^'^Y\Tr\fdr = Op{l) 
Jo 



and 



sup 

sG(0,l] 



t=i 



< sup 

sG{0,l] 



LTsJ 

t=i 



sup ILTsJ-^/'FltsjI =Op(l). 
se(o,i] 



Combining these facts with $\sup_{s \in (0,1]} \lfloor Ts \rfloor\, |\hat\rho_{\lfloor Ts \rfloor} - 1| = O_P(1)$, we obtain

[Ts] 

t=i 

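where the $o_P(1)$ term is uniform in $s \in (0, 1]$. Because (E1) implies that

$$\gamma_2(k) = \mathrm{Cov}(e_t^2, e_{t+k}^2) = o(1), \qquad |k| \to \infty,$$

we may apply the law of large numbers for time series (Brockwell and Davis (1991), Th.
7.1.1) and obtain, since stochastic convergence to a constant yields stochastic convergence
in the Skorokhod topology,

(20) $d(S^2_{\lfloor T \cdot \rfloor}, \sigma^2) \xrightarrow{P} 0,$

as $T \to \infty$. We shall now show joint weak convergence of $(D_T(s), S^2_{\lfloor Ts \rfloor}, \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2)$,
$s \in (0, 1]$. Let $(\lambda_1, \lambda_2, \lambda_3) \in \mathbb{R}^3 - \{0\}$ and consider

$$\lambda_1 D_T(s) + \lambda_2 S^2_{\lfloor Ts \rfloor} + \lambda_3 \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2.$$

The proof of Theorem 2.1 implies that

$$\lambda_1 D_T(s) + \lambda_3 \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2 \Rightarrow \lambda_1 \mathcal{D}_\vartheta(s) + \lambda_3 \frac{\eta^2}{s^2} \int_0^s B(r)^2\, dr,$$

as $T \to \infty$. Due to (20), we obtain

$$\lambda_1 D_T(s) + \lambda_2 S^2_{\lfloor Ts \rfloor} + \lambda_3 \lfloor Ts \rfloor^{-2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{t-1}^2 \Rightarrow \lambda_1 \mathcal{D}_\vartheta(s) + \lambda_2 \sigma^2 + \lambda_3 \frac{\eta^2}{s^2} \int_0^s B(r)^2\, dr,$$

as $T \to \infty$. Therefore, the CMT implies that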



^2 



and 



Dt(s) = , = 



as $T \to \infty$, yielding the assertion. □

We are now in the position to establish consistency of the t-type detection rule

$$\tilde S_T = \inf\{k \le t \le T : \tilde D_T(t/T) < c(\hat\vartheta_t)\},$$

which uses estimated control limits. Notice that Theorem 2.4 implies that $c(\vartheta)$ is given by
$P_0(\inf_{s \in [\kappa,1]} \tilde{\mathcal{D}}_\vartheta(s) < c(\vartheta)) = \alpha$.




Theorem 2.5. Assume (E1), (E2), (K1)-(K3), and additionally that the lag truncation
parameter of the Newey-West estimator satisfies

$$m = o(T^{1/2}), \qquad T \to \infty.$$

Then the t-type weighted Dickey-Fuller control chart with estimated control limits, $\tilde S_T$, is
consistent, i.e.,

$$P(\tilde S_T \le T) \to \alpha,$$

as $T \to \infty$.

Proof. The result is shown along the lines of the proof of Theorem 2.2, since the process
$\tilde{\mathcal{D}}_\vartheta$ is continuous w.p. 1, and is a continuous function of $\vartheta$. □

Finally, for the transformed process $\tilde E_T$ and the associated control chart $\tilde Z_T$ we have the
following result.

Theorem 2.6. Assume (E1), (E2), (K1)-(K3), and

$$m = o(T^{1/2}), \qquad T \to \infty.$$

Then the transformed t-type weighted DF process $\tilde E_T$, defined in [T^ , converges weakly,

$$\tilde E_T \Rightarrow \tilde{\mathcal{D}}_1, \qquad \text{in } (D[0,1], d),$$

as $T \to \infty$, and for the transformed t-type weighted DF control chart we have

$$\tilde Z_T / T \xrightarrow{d} \inf\{\kappa \le s \le 1 : \tilde{\mathcal{D}}_1(s) < c\}.$$

Particularly, the asymptotic distribution is invariant with respect to $\vartheta$.

Proof. Note that the first term of Et converges weakly to d~^D^., which has the form 
[A{s) — -i?^^ K{C^{s — r)) dr]/[j^ B'^{r) dr]^/'^. Hence, the construction of the correction 
term is as for Et- □ 
3. Asymptotics under local-to-unity alternatives



In econometric applications, the stationary alternatives of interest are often of the form
$0 < \rho < 1$ with $1 - \rho$ small. To mimic this situation asymptotically, we consider a local-to-unity
model where the AR parameter depends on $T$ and tends to 1 as the time horizon $T$
increases.

The functional central limit theorem given below shows that the asymptotic distribution
under local-to-unity alternatives is also affected by the nuisance parameter $\vartheta$. However,
the term which depends on the parameter that parameterises the local alternative does not
depend on $\vartheta$ (or $\eta$). Therefore, if one takes the nuisance parameter $\vartheta$ into account when
designing a detection procedure, we obtain local asymptotic power.

Let us assume that we are given an array $\{Y_{T,t}\} = \{Y_{T,t} : 1 \le t \le T,\ T \in \mathbb{N}\}$ of observations
satisfying

(21) $Y_{T,0} = 0, \qquad Y_{T,t} = \rho_T Y_{T,t-1} + e_t, \qquad t = 1, \dots, T, \quad T \ge 1,$

where the sequence of AR parameters $\{\rho_T\}$ is given by

$$\rho_T = 1 + a/T, \qquad T \ge 1,$$

for some constant $a$. $\{e_t\}$ is a mean-zero stationary I(0) process satisfying (E1). For brevity
of notation, $D_T$ denotes in this section the process (7) with $Y_t$ replaced by $Y_{T,t}$.

The limit distribution will be driven by an Ornstein-Uhlenbeck process. Recall that the
Ornstein-Uhlenbeck process $Z_a$ with parameter $a$ is defined by

(22) $Z_a(s) = \int_0^s e^{a(s-r)}\, dB(r), \qquad s \in [0,1],$

where $B$ denotes Brownian motion.
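To make the connection between the local-to-unity recursion and the Ornstein-Uhlenbeck limit concrete, the following minimal Python sketch simulates one row of the array and rescales it by $T^{-1/2}$. It is an illustration only: the function names `local_to_unity_path` and `scaled_path` are hypothetical, and the shocks are taken to be i.i.d. standard normal (so that $\eta = 1$), which is an assumption made for simplicity and is not required by (E1).

```python
import math
import random


def local_to_unity_path(shocks, a):
    """Return [Y_0, ..., Y_T] with Y_0 = 0 and Y_t = (1 + a/T) * Y_{t-1} + e_t.

    T is taken to be len(shocks); the AR parameter is rho_T = 1 + a/T.
    """
    T = len(shocks)
    rho = 1.0 + a / T
    path = [0.0]
    for e in shocks:
        path.append(rho * path[-1] + e)
    return path


def scaled_path(path):
    """Return T^{-1/2} * Y_t for t = 0, ..., T (a discretised version of U_T)."""
    T = len(path) - 1
    return [y / math.sqrt(T) for y in path]


if __name__ == "__main__":
    random.seed(1)
    T, a = 250, -5.0
    # Assumption: i.i.d. N(0, 1) shocks, chosen only for illustration.
    shocks = [random.gauss(0.0, 1.0) for _ in range(T)]
    u = scaled_path(local_to_unity_path(shocks, a))
    # The discrete weight (1 + a/T)^T is close to the continuous weight e^a.
    print((1.0 + a / T) ** T, math.exp(a))
    print(u[-1])
```

For large $T$ the weights $(1 + a/T)^{\lfloor Ts \rfloor - \lfloor Tr \rfloor}$ generated by the recursion are close to $e^{a(s-r)}$, which is why the rescaled path behaves like a multiple of $Z_a$.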

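Theorem 3.1. Assume (E1), and (K1)-(K3). Under the local-to-unity model (21) we have
for the weighted Dickey-Fuller process

$$D_T(s) \Rightarrow \mathcal{D}^a_\vartheta(s),$$

as $T \to \infty$, where the a.s. $C[0,1]$-valued process $\mathcal{D}^a_\vartheta$ is given by

$$
\mathcal{D}^a_\vartheta(s) = \frac{K(0) Z_a^2(s) + \zeta \int_0^s Z_a^2(r) K'(\zeta(s-r))\, dr - 2a \int_0^s Z_a^2(r) K(\zeta(s-r))\, dr - \vartheta^{-2} \int_0^s K(\zeta(s-r))\, dr}{(2/s) \int_0^s Z_a^2(r)\, dr}
$$

for $s \in (0, 1]$, and $\mathcal{D}^a_\vartheta(0) = 0$. Here $Z_a$ denotes the Ornstein-Uhlenbeck process defined in
(22). Further,

$$S_T / T \xrightarrow{d} \inf\{s \in [\kappa, 1] : \mathcal{D}^a_\vartheta(s) < c\}, \qquad \text{as } T \to \infty.$$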
Proof. The crucial arguments to obtain joint weak convergence of numerator and denominator
of $D_T$ have been given in detail in the proof of Theorem 2.1. Therefore, we give only
a sketch of the proof stressing the essential differences. First, note that

$$U_T(s) = T^{-1/2}\, Y_{T, \lfloor Ts \rfloor} = \int_0^s e_T(r;s)\, dS_T(r), \qquad S_T(r) = T^{-1/2} \sum_{t=1}^{\lfloor Tr \rfloor} e_t,$$

for the step function $e_T(r;s) = (1 + a/T)^{\lfloor Ts \rfloor - \lfloor Tr \rfloor}$, $r, s \in [0,1]$, which has uniformly bounded
variation and converges uniformly in $r, s$ to the exponential $e(r;s) = e^{a(s-r)}$. Hence, firstly,
the stochastic Stieltjes integral $\int_0^s e_T(r;s)\, dS_T(r)$ exists (via integration by parts), and,
secondly, by estimating the terms of the decomposition
$\int_0^s e_T\, dS_T - \int_0^s e\, d(\eta B) = \int_0^s (e_T - e)\, d(\eta B) + \int_0^s e_T\, d(S_T - \eta B)$ we see that

$$U_T(s) = \int_0^s e_T(r;s)\, dS_T(r) \Rightarrow \eta \int_0^s e(r;s)\, dB(r) = \eta Z_a(s)$$

as $T \to \infty$. Next, note that in the local-to-unity model we have

$$Y_{T,t-1}\, e_t = \frac{1}{2\rho_T} \big( Y_{T,t}^2 - Y_{T,t-1}^2 + (1 - \rho_T^2)\, Y_{T,t-1}^2 - e_t^2 \big)$$

for all $1 \le t \le T$. This yields the decomposition

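$$D_T(s) = \sum_{i=1}^{3} V_{i,T}(s) / W_T(s),$$

where for $s \in (0, 1]$

$$
\begin{aligned}
V_{1,T}(s) &= \frac{1}{2\rho_T \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} (Y_{T,t}^2 - Y_{T,t-1}^2)\, K((\lfloor Ts \rfloor - t)/h), \\
V_{2,T}(s) &= \frac{1 - \rho_T^2}{2\rho_T \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{T,t-1}^2\, K((\lfloor Ts \rfloor - t)/h), \\
V_{3,T}(s) &= -\frac{1}{2\rho_T \lfloor Ts \rfloor} \sum_{t=1}^{\lfloor Ts \rfloor} e_t^2\, K((\lfloor Ts \rfloor - t)/h), \\
W_T(s) &= \frac{1}{\lfloor Ts \rfloor^2} \sum_{t=1}^{\lfloor Ts \rfloor} Y_{T,t-1}^2.
\end{aligned}
$$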

The term $V_{1,T}$ can be treated as in the proof of Theorem 2.1, namely,



1 

2s 



/ Ul{r)K\C{s-r))dr + op{l] 
Jo 

From the proof of Theorem 2.1 we know that due to (E1)

$$\sup_{s} \left| V_{3,T}(s) + \frac{\sigma^2}{2s} \int_0^s K(\zeta(s-r))\, dr \right| \xrightarrow{P} 0,$$

as $T \to \infty$. Consider now $V_{2,T}$. By definition of $\rho_T$ we obtain
1-Pt 1 



2,T 



-2a - aVT 1 
2(l + a/r) TLTsJ ^ 



J2'^lt-iKii[Ts\-t)/h), 



= -HsW Zl{r)K{as~r))dr+'^K{Q)Zl{s) + Op{l), 

where due to (K2) the $o_P(1)$ term is uniform in $s \in (0, 1]$. Hence, $V_{1,T}$, $V_{2,T}$, and $W_T$
are functionals of $U_T$ up to terms of order $o_P(1)$. Consequently, joint weak convergence of
$(V_{1,T}, V_{2,T}, V_{3,T}, W_T)$ can be shown along the lines of the proof of Theorem 2.1, and the
CMT yields the result. □
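The algebraic identity for $Y_{T,t-1} e_t$ on which the decomposition in the proof rests follows from squaring the recursion $Y_{T,t} = \rho_T Y_{T,t-1} + e_t$. As a sanity check, the short Python sketch below evaluates both sides for arbitrary inputs. The function name `identity_gap` is hypothetical and serves only this illustration.

```python
def identity_gap(y_prev, e, rho):
    """Return lhs - rhs of the identity
        y_prev * e = (y^2 - y_prev^2 + (1 - rho^2) * y_prev^2 - e^2) / (2 * rho),
    where y = rho * y_prev + e is the next value of the AR(1) recursion.
    """
    y = rho * y_prev + e
    lhs = y_prev * e
    rhs = (y * y - y_prev * y_prev + (1.0 - rho * rho) * y_prev * y_prev - e * e) / (2.0 * rho)
    return lhs - rhs


if __name__ == "__main__":
    # The gap vanishes up to floating-point rounding for any rho != 0.
    print(identity_gap(1.3, -0.7, 0.98))
    print(identity_gap(-2.0, 0.4, 1.0))
```

For $\rho_T = 1$ the middle term drops out and the identity reduces to the one used under the random walk null hypothesis.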
4. Simulations



To investigate the statistical properties of the proposed monitoring procedure we performed
a simulation study. We used the following ARMA(1,1) simulation model. Suppose

$$Y_{t+1} = \rho Y_t + e_t - \beta e_{t-1}, \qquad t = 1, 2, \dots, T = 250,$$

where $Y_0 = 0$, $\{e_t\}$ is a sequence of independent $N(0,1)$-distributed error terms, and
$\rho$ and $\beta$ are parameters. We investigated the cases given by $\rho = 1, 0.98, 0.95, 0.9$ and
$\beta = -0.8, -0.5, 0, 0.5, 0.8$. Clearly, $\rho = 1$ corresponds to the unit root null hypothesis. For
$\beta = 0$ the innovation terms are uncorrelated, corresponding to $\vartheta = 1$. This simulation model
was also used in Steland (2006), where a monitoring procedure based on the KPSS unit
root test is studied in detail. Since part of the parameter settings used below are identical,
the results of the present numerical study can be compared with the corresponding results
in Steland (2006).
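For readers who wish to reproduce the design, a trajectory of the simulation model can be generated as in the following Python sketch. It is a minimal illustration under stated assumptions, not the code used for the study: the time index is shifted so that $Y_t = \rho Y_{t-1} + e_t - \beta e_{t-1}$ for $t = 1, \dots, T$, the pre-sample shock $e_0$ is drawn from the same $N(0,1)$ law, and the function name `simulate_arma11` is hypothetical.

```python
import random


def simulate_arma11(T, rho, beta, rng):
    """Return [Y_0, ..., Y_T] with Y_0 = 0 and Y_t = rho*Y_{t-1} + e_t - beta*e_{t-1}.

    rng must provide gauss(mu, sigma), e.g. an instance of random.Random.
    Assumption: the pre-sample shock e_0 is drawn like all other shocks.
    """
    shocks = [rng.gauss(0.0, 1.0) for _ in range(T + 1)]  # e_0, ..., e_T
    y = [0.0]
    for t in range(1, T + 1):
        y.append(rho * y[t - 1] + shocks[t] - beta * shocks[t - 1])
    return y


if __name__ == "__main__":
    rng = random.Random(2006)
    # One trajectory under the null hypothesis and one under an alternative.
    y_null = simulate_arma11(250, 1.0, 0.0, rng)
    y_alt = simulate_arma11(250, 0.95, 0.5, rng)
    print(len(y_null), len(y_alt))
```

Passing the random number generator explicitly keeps the sketch reproducible and makes it easy to run the 20 parameter combinations of the study in a loop.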

To study the monitoring rules with estimated control limits, critical values for a significance
level of $\alpha = 5\%$ were taken from the limit process defined in f|T5|) with estimated nuisance
parameter. To down-weight past contributions a Gaussian kernel with bandwidth $h = 25$
was used. The nuisance parameter $\vartheta$ was estimated by the Newey-West estimator at time
point $t$ with lag truncation parameter $m$ chosen by

$$m = m_t = \lfloor 4 (t/100)^{1/4} \rfloor, \qquad t = k, \dots, T.$$
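The estimation step can be sketched as follows. The Python code below is a hedged illustration rather than the implementation used here: it assumes the common Bartlett weights $1 - j/(m+1)$ for the Newey-West estimator, uses uncentred autocovariances (the differences are mean-zero under the null hypothesis), treats the exponent of the lag rule as a parameter with default $1/4$, and takes $\hat\vartheta = \hat\eta/\hat\sigma$. All function names are hypothetical.

```python
import math


def lag_truncation(t, exponent=0.25):
    """Lag rule m_t = floor(4 * (t/100)^exponent); the default exponent is an assumption."""
    return int(math.floor(4.0 * (t / 100.0) ** exponent))


def autocovariance(x, j):
    """Uncentred sample autocovariance at lag j (x is assumed to be mean-zero)."""
    n = len(x)
    return sum(x[i] * x[i - j] for i in range(j, n)) / n


def newey_west_lrv(x, m):
    """Long-run variance estimate with Bartlett weights 1 - j/(m+1) (assumed form)."""
    lrv = autocovariance(x, 0)
    for j in range(1, m + 1):
        lrv += 2.0 * (1.0 - j / (m + 1.0)) * autocovariance(x, j)
    return lrv


def theta_hat(x, m):
    """Estimate of the nuisance parameter as eta_hat / sigma_hat (assumed definition)."""
    return math.sqrt(newey_west_lrv(x, m) / autocovariance(x, 0))


if __name__ == "__main__":
    diffs = [1.0, -1.0, 1.0, -1.0]  # toy first differences with negative correlation
    print(lag_truncation(250), newey_west_lrv(diffs, 1), theta_hat(diffs, 1))
```

In a monitoring run the estimate would be recomputed at each time point $t$ from the first differences observed so far, with $m = m_t$.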

The start of monitoring, $k$, affects the properties and has to be chosen carefully. For the
rule $\hat S_T$ we used $k = 50$, whereas for $\tilde S_T$ a larger value, $k = 75$, yielded better results.
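The mechanics of a kernel-weighted monitoring rule started at time $k$ can be sketched in Python as follows. The sketch rests on several assumptions that should be checked against the definitions of the paper: the statistic is taken to be $\lfloor Ts \rfloor^{-1} \sum_t Y_{t-1} \Delta Y_t K((\lfloor Ts \rfloor - t)/h)$ divided by $\lfloor Ts \rfloor^{-2} \sum_t Y_{t-1}^2$, the Gaussian kernel is used without normalising constant, and a fixed control limit `c` replaces the estimated limits $c(\hat\vartheta_t)$. The names `weighted_df` and `monitor` are hypothetical.

```python
import math


def gaussian_kernel(z):
    """Unnormalised Gaussian kernel with K(0) = 1 (the normalisation is an assumption)."""
    return math.exp(-0.5 * z * z)


def weighted_df(y, t, h):
    """Kernel-weighted Dickey-Fuller type statistic at time t for y = [Y_0, ..., Y_T]."""
    num = sum(
        y[i - 1] * (y[i] - y[i - 1]) * gaussian_kernel((t - i) / h)
        for i in range(1, t + 1)
    ) / t
    den = sum(y[i - 1] ** 2 for i in range(1, t + 1)) / (t * t)
    if den == 0.0:
        return 0.0  # degenerate case: all lagged values are zero
    return num / den


def monitor(y, k, h, c):
    """Return the first t in {k, ..., T} with statistic below c, or None if no signal."""
    T = len(y) - 1
    for t in range(k, T + 1):
        if weighted_df(y, t, h) < c:
            return t
    return None


if __name__ == "__main__":
    oscillating = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
    trending = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    print(monitor(oscillating, 4, 25.0, -0.5), monitor(trending, 2, 25.0, -0.5))
```

Large negative values of the statistic indicate mean reversion, so a signal in favor of stationarity is given as soon as the statistic falls below the control limit.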

To investigate the properties of the monitoring rule, we estimate empirical rejection rates
of the test which rejects the unit root null hypothesis if the procedure gives a signal, the
average delay, and the average conditional delay given a signal. For the detection rule $\hat S_T$
the average run length (ARL) is defined by $E(\hat S_T) - k + 1$. We define the conditional
average run length (CARL) as $E(\hat S_T \mid k \le \hat S_T \le T) - k + 1$.
The definitions for $\tilde S_T$ are analogous. Note that the conditional delay is very informative
under the alternative, since it tells us how quickly the method reacts if it reacts at all. In
the tables, average delays are given in brackets and conditional delays in parentheses.
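Given the stopping times of a set of simulated trajectories, the three performance measures can be computed as in the following Python sketch. The function name `run_length_summary` is hypothetical, runs without a signal are encoded as `None`, and the value assigned to such runs when averaging for the ARL must be supplied by the caller through `no_signal_value`, since that convention is an assumption not fixed by the definitions above.

```python
def run_length_summary(stopping_times, k, T, no_signal_value):
    """Return (rejection_rate, arl, carl) for simulated stopping times.

    stopping_times: list of ints in {k, ..., T}, or None when no signal occurred.
    no_signal_value: value used for no-signal runs in the ARL (caller's assumption).
    """
    n = len(stopping_times)
    signals = [s for s in stopping_times if s is not None and k <= s <= T]
    rejection_rate = len(signals) / n
    # CARL: mean stopping time over signalling runs only, shifted by the start k.
    carl = sum(signals) / len(signals) - k + 1 if signals else None
    # ARL: mean over all runs, with no-signal runs replaced by no_signal_value.
    filled = [s if s is not None else no_signal_value for s in stopping_times]
    arl = sum(filled) / n - k + 1
    return rejection_rate, arl, carl


if __name__ == "__main__":
    times = [50, 60, None, 70]
    print(run_length_summary(times, 50, 250, 250))
```

Because the CARL averages only over signalling runs, a few late signals can dominate it even when many runs signal immediately, which is why it is worth inspecting the whole distribution of the delays as well.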
Table 1 provides the results for the monitoring procedures $\hat S_T$ and $\tilde S_T$ using estimated
control limits. The curves $c(\vartheta)$ were obtained by simulating from the limit laws. Overall, $\hat S_T$
performed well. The performance of the t-type procedure is disappointing. When inspecting
the CARL values, the results seem puzzling. E.g., when comparing the CARL
for $\rho = 0.95$ and $\rho = 0.9$ if $\beta = 0$, the procedure seems to misbehave. To explore the
reason, Figure 1 provides a part of the distribution of $\hat S_T - k + 1$. It can be seen that the
percentage of simulated trajectories leading to immediate detection increases considerably,
but the contribution of these cases to the calculation of the CARL is negligible. The
other trajectories yielding a signal are hard to detect, and the signals are spread over the
remaining time points with many late signals, which suffice to yield large CARL values.
This shows that a single number such as the CARL cannot summarize the statistical
behavior sufficiently. It also highlights the benefit that the random walk null hypothesis can
often be rejected very early.

The simulation results for the control charts using transformed statistics are summarized
in Table 2. Here we used exact control limits obtained by simulation using 20,000 repetitions.
Comparing the transformed control statistics with these control limits yields quite
accurate results if $\beta = 0$. The t-type version is preferable for $\beta < 0$.

Comparing the methods $\hat S_T$ (using estimated control limits) and $Z_T$ (using transformed
statistics), our results indicate that the more computer-intensive approach of using estimated
control limits provides more accurate results.
Table 1. Results for the weighted DF control chart with estimated control limits, $\hat S_T$.

Table 2. Results for the transformed weighted DF control charts $Z_T$ and $\tilde Z_T$.
Figure 1. Part of the distribution of $\hat S_T - k + 1$ for $\rho = 0.95$ (circles) and
$\rho = 0.9$ (crosses).



References

[1] Andrews, D.W.K. (1991). Heteroscedasticity and autocorrelation consistent covariance matrix estimation. Econometrica, 59, 3, 817-858.

[2] Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods, 2nd edition, Springer, New York.

[3] Chan, N.H. and Wei, C.Z. (1987). Asymptotic inference for nearly nonstationary AR(1) processes. Annals of Statistics, 15, 3, 1050-1063.

[4] Chan, N.H. and Wei, C.Z. (1988). Limiting distributions of least squares estimates of unstable autoregressive processes. Annals of Statistics, 16, 1, 367-401.

[5] Dickey, D.A. and Fuller, W.A. (1979). Distribution of the estimates for autoregressive time series with a unit root. Journal of the American Statistical Association, 74, 427-431.

[6] Engle, R.F. and Granger, C.W.J. (1987). Co-integration and error correction: representation, estimation, and testing. Econometrica, 55, 2, 251-276.

[7] Evans, G.B.A. and Savin, N.E. (1981). The calculation of the limiting distribution of the least squares estimator of the parameter in a random walk model. Annals of Statistics, 9, 8, 1114-1118.

[8] Ferger, D. (1993). Nonparametric detection of changepoints for sequentially observed data. Stochastic Processes and their Applications, 51, 359-372.

[9] Ferger, D. (1995). Nonparametric tests for nonstandard change-point problems. Annals of Statistics, 23, 1848-1861.

[10] Fuller, W.A. (1976). Introduction to Statistical Time Series, Wiley, New York.

[11] Giraitis, L., Kokoszka, P., and Leipus, R. (2000). Stationary ARCH models: dependence structure and central limit theorem. Econometric Theory, 16, 1, 3-22.

[12] Granger, C.W.J. (1981). Some properties of time series data and their use in econometric model specification. Journal of Econometrics, 121-130.

[13] Huskova, M. (1999). Gradual change versus abrupt change. Journal of Statistical Planning and Inference, 76, 109-125.

[14] Huskova, M. and Slaby, A. (2001). Permutation tests for multiple changes. Kybernetika, 37, 5, 605-622.

[15] Kwiatkowski, D., Phillips, P.C.B., Schmidt, P., and Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? Journal of Econometrics, 54, 159-178.

[16] Lifshits, M.A. (1982). On the absolute continuity of distributions of functionals of random processes. Theory Probab. Appl., 27, 600-607.

[17] Newey, W.K. and West, K.D. (1987). A simple positive semi-definite, heteroscedasticity and autocorrelation consistent covariance matrix. Econometrica, 55, 703-708.

[18] Pawlak, M., Rafajlowicz, E., and Steland, A. (2004). Detecting jumps in time series - Nonparametric setting. Journal of Nonparametric Statistics, 16, 329-347.

[19] Phillips, P.C.B. (1987). Time series regression with a unit root. Econometrica, 55, 2, 277-302.

[20] Rao, M.M. (1978). Asymptotic distribution of an estimator of the boundary parameter of an unstable process. Annals of Statistics, 6, 185-190.

[21] Rao, M.M. (1980). Correction to "Asymptotic distribution of an estimator of the boundary parameter of an unstable process". Annals of Statistics, 8, 1403.

[22] Steland, A. (2004). Sequential control of time series by functionals of kernel-weighted empirical processes under local alternatives. Metrika, 60, 229-249.

[23] Steland, A. (2005a). Optimal sequential kernel smoothers under local nonparametric alternatives for dependent processes. Journal of Statistical Planning and Inference, 132, 131-147.

[24] Steland, A. (2005b). Random walks with drift - A sequential view. Journal of Time Series Analysis, 26, 6, 917-942.

[25] Steland, A. (2006). Monitoring procedures to detect unit roots and stationarity. Forthcoming: Econometric Theory.

[26] Stock, J.H. (1994). Unit roots, structural breaks and trends. In: Handbook of Econometrics, 4, 2739-2841.

[27] van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes. Springer, New York.

[28] White, J.S. (1958). The limiting distribution of the serial coefficient in the explosive case. Ann. Math. Statist., 29, 1188-1197.

[29] Xiao, Z. and Phillips, P.C.B. (2002). A CUSUM test for cointegration using regression residuals. Journal of Econometrics, 108, 43-61.

[30] Yokoyama, R. (1980). Moment bounds for stationary mixing sequences. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 52, 45-57.



